ABSTRACT

We use subhalo abundance matching (SHAM) to model the stellar mass
function (SMF) and clustering of the Baryon Oscillation Spectroscopic
Survey (BOSS) “CMASS” sample at z ~ 0.5. We introduce a novel method
which accounts for the stellar mass incompleteness of CMASS as a
function of redshift, and produce CMASS mock catalogs which include
selection effects and reproduce the overall SMF, the projected
two-point correlation function w_p, and the CMASS dn/dz. These
catalogs are made publicly available. We study the effects of
assembly bias above collapse mass in the context of “age matching”
and show that these effects are markedly different from the ones
explored by Hearin et al. (2013) at lower stellar masses. We
construct two models, one in which galaxy color is stochastic
(“AbM” model) and one which contains assembly bias effects (“AgM”
model). By confronting the redshift-dependent clustering of CMASS
with the predictions from our models, we argue that galaxy colors
are not a stochastic process in high-mass halos. Our results suggest
that the colors of galaxies in high-mass halos are determined by
other halo properties besides halo peak velocity and that assembly
bias effects play an important role in determining the clustering
properties of this sample.

Key words: cosmology: large-scale structure of Universe, cosmological parameters, 
galaxies: halos, statistics 


1 INTRODUCTION 

The overall picture that galaxies form, evolve, and reside in
dark matter halos that assemble hierarchically has gained
consensus by passing a variety of observational tests over
a wide range of cosmic history (for a review, see Mo et al.
2010). However, understanding the detailed relation between
galaxies and dark matter halos is critical in order to form
a more concrete theory of galaxy formation and evolution.
In particular, unveiling how the stellar masses and
star-formation properties of galaxies depend on halo properties
is still a topic of active investigation. For low-mass galaxies
(M* < ...), recent studies of galaxy clustering and
galaxy-galaxy lensing suggest that red and blue galaxies live
in halos of different mass at fixed stellar mass at 0 < z < 1
(Zehavi et al. 2005; Mandelbaum et al. 2006; Tinker et al.
2013; Coupon et al. 2015; Mandelbaum et al. 2015) or that
at fixed stellar mass, galaxy color may correlate with halo
age (Hearin et al. 2014b).

* E-mail: shun.saito@ipmu.jp

While many previous studies focus on low or intermediate
mass galaxies, the galaxy-halo mass connection is also
worth investigating for the most massive galaxies in the
universe. The majority of galaxies with masses M* > 10^11 M⊙
are centrals hosted by massive halos (M_halo ≳ 10^13 M⊙)
(White et al. 2011; Leauthaud et al. 2011; Coupon et al.
2015). From a theoretical standpoint, gas in these
high-mass halos is thought to be heated by pressure-supported
shocks (the so-called “hot halo mode”, Dekel & Birnboim
2006). In addition, at these halo masses, “maintenance
mode” feedback mechanisms such as radio-mode feedback
are thought to further limit star formation in the most
massive galaxies (e.g., Croton et al. 2006). Observationally,
however, not all massive galaxies are systematically “red and
dead”. For example, although they are rare, brightest
cluster galaxies associated with cool core clusters can exhibit
star formation rates of order O(10-100) M⊙ yr^-1 (e.g., in
Abell 1835 at z ~ 0.25 and in Perseus A and Cygnus A
at z ~ 0.1) (e.g., Liu et al. 2012; McDonald et al. 2012;
Fraser-McKelvie et al. 2014). At group scales, Tinker et al.
(2012) found that as many as 20% of central galaxies in
halos with log10(M_halo/M⊙) > 13 at z ~ 0.5 have sufficient
levels of star formation to exhibit blue colors. A key
question is then: what determines color in high-mass halos? Is
star formation in massive galaxies simply a stochastic
process due to episodic gas cooling and/or due to
mergers with gas-rich satellites? Or are the colors of massive
galaxies more fundamentally linked to the assembly history of
their parent dark matter halos?

Large spectroscopic samples of massive galaxies are of
tremendous value in addressing these types of questions.
Spectroscopic redshifts are crucial for computing precise
measurements of galaxy clustering and galaxy-galaxy lensing,
which can be used to constrain the galaxy-halo connection
(e.g., Mandelbaum et al. 2006; Leauthaud et al. 2011;
Coupon et al. 2015). The availability of spectroscopic
redshifts also reduces uncertainties on stellar mass estimates.
Spectroscopic surveys such as zCOSMOS (Lilly et al. 2007),
VVDS (Fevre et al. 2015), DEEP2 (Newman et al. 2013),
PRIMUS (Coil et al. 2011), and VIPERS (Guzzo et al.
2014), however, cover relatively small areas ranging from
a few square degrees to a few tens of square degrees and
do not provide statistically significant samples of the most
massive galaxies (log10(M*/M⊙) > 11.5). For this reason,
we turn our attention instead to the Sloan Digital Sky
Survey III (SDSS-III, Eisenstein et al. 2011) Baryon Oscillation
Spectroscopic Survey (BOSS, Dawson et al. 2013). The
main BOSS cosmological sample, the so-called CMASS sample
(Reid et al. 2016), includes roughly half a million massive
galaxies at log10(M*/M⊙) > 11.0 at 0.43 < z < 0.70 and
covers a gigantic volume of approximately 2.5 (h^-1 Gpc)^3 at
the tenth data release (DR10) (Ahn et al. 2014). This gigantic
dataset enables high signal-to-noise ratio measurements
of three-dimensional galaxy clustering on large scales
(typically at separations of r > 10 Mpc) and provides the most
accurate measurement of the Baryon Acoustic Oscillation (BAO)
scale and the Redshift-Space Distortion (RSD) signal, with a
precision in DR11 (Alam et al. 2015) of ≈ 1% and ≈ 10%,
respectively (e.g., Anderson et al. 2014; Beutler et al. 2014a,b;
Samushia et al. 2014).

The main goal of this paper is to model the connection
between galaxy mass, color, and halo mass for massive
galaxies using the BOSS CMASS dataset. In addition to
providing insight on the evolution of massive galaxies, a
detailed understanding of the CMASS-halo connection is also
critical because BOSS analysis pipelines need to be
systematically tested against realistic CMASS mock catalogs. Mock
catalogs within the BOSS collaboration (e.g., White et al.
2011; Manera et al. 2012; White et al. 2013; Kitaura et al.
2013) are typically based on the Halo Occupation Distribution
(HOD) approach (see e.g., Berlind & Weinberg 2002;
Zheng et al. 2005). However, to date, most studies have
assumed that CMASS is a homogeneous sample and have
ignored any redshift-dependent selection effects.

Indeed, one difficulty with the CMASS sample, which
affects both studies of massive galaxies and the creation
of realistic mock catalogs, is accounting for the
selection function of the sample. The CMASS selection
algorithm was roughly designed to select a “constant
stellar-mass” sample and is often quoted as being mass
limited at log10(M*/M⊙) > 11.3 over the redshift range
0.43 < z < 0.7. However, Leauthaud et al. (2015)
(hereafter, L15) demonstrate that CMASS is only 80%
complete at log10(M*/M⊙) > 11.6 in the narrow redshift range
0.51 < z < 0.61. Our paper improves on previous analyses
by presenting a careful treatment of the stellar mass
completeness of the CMASS sample in our models.

To model the CMASS-halo connection we adopt the 
subhalo abundance matching (SHAM) technique. SHAM 
is a fairly simple and empirical approach which assumes 
that galaxy properties such as luminosity or stellar mass 
are monotonically related to (sub)halo properties such as 
mass or circular velocity (see e.g., Kravtsov et al. 2004; 
Vale & Ostriker 2004; Conroy et al. 2006; Moster et al. 
2010; Behroozi et al. 2010). Although there are model
ambiguities in this method (e.g., in choosing which properties
to relate and how scatter is introduced), SHAM requires
relatively few parameters and also provides a straightforward
prescription for linking galaxy properties to dark matter
halos in numerical N-body simulations. Our work can be
considered as an update to Nuza et al. (2013) who used the
SHAM approach to model the CMASS-halo connection but 
without accounting for the stellar mass completeness of the 
CMASS sample. 
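
As a minimal illustration of the rank-ordering idea behind SHAM, the sketch below assigns the largest stellar mass to the subhalo with the largest ranking property, the second largest to the second, and so on. It is a scatter-free toy version, not the implementation used in this paper; the function name `abundance_match` and the choice of V_peak as the ranking property in the example are assumptions made for illustration.

```python
def abundance_match(halo_property, stellar_mass):
    """Scatter-free abundance matching by rank.

    halo_property -- one value per subhalo (e.g. V_peak; an assumption here)
    stellar_mass  -- one value per galaxy, drawn from the SMF for the same
                     volume; must have the same length as halo_property
    Returns a list whose i-th entry is the stellar mass given to subhalo i.
    """
    if len(halo_property) != len(stellar_mass):
        raise ValueError("need exactly one galaxy per subhalo")
    # Subhalo indices ordered from the largest to the smallest property.
    halo_order = sorted(range(len(halo_property)),
                        key=lambda i: halo_property[i], reverse=True)
    # Stellar masses ordered from the most to the least massive.
    masses = sorted(stellar_mass, reverse=True)
    assigned = [0.0] * len(halo_property)
    for rank, halo_index in enumerate(halo_order):
        assigned[halo_index] = masses[rank]
    return assigned


# Toy example: three subhalos (V_peak in km/s) and three log10 stellar masses.
matched = abundance_match([250.0, 900.0, 400.0], [11.4, 11.0, 11.9])
```

In practice scatter between stellar mass and the halo property has to be added on top of this monotonic mapping, which is one of the model ambiguities mentioned above.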

In addition to the standard implementation of SHAM, 
we also explore the age matching model introduced by 
Hearin et al. (2013a) (hereafter, H13) which introduces 
galaxy color by assuming that at fixed stellar mass, redder 
galaxies reside in older sub-halos. The age matching scheme 
can qualitatively explain a variety of observed statistics in 
the SDSS main galaxy sample including color-dependent 
galaxy clustering (Hearin et al. 2013a; Watson et al. 2014), 
magnitude gap statistics in galaxy groups (Hearin et al. 
2013b), galaxy-galaxy lensing (Hearin et al. 2014b), galaxy 
conformity (Hearin et al. 2014a), and halo mass dependence 
of the specific star formation rate (Lim et al. 2015). 

Our models are constrained by three observables: the
clustering of CMASS on radial scales r ~ 0.1-10 Mpc, the
galaxy stellar mass function (SMF), and the SMF of CMASS
galaxies as a function of redshift.

This paper is organized as follows. The observational 
data are summarized in Section 2. Our measurements of the 
correlation function and the galaxy stellar mass function are 
described in Section 3. In particular, Section 3.3 presents 
our measurements of the redshift-dependent CMASS SMFs 
that are an essential ingredient in this study. Section 4 briefly 
summarizes the simulated subhalo catalog. Section 5 is a
detailed presentation of our SHAM and age matching
methodology. Our results are described in Section 6 and discussed
in Section 7. Finally we summarize and conclude our study 
in Section 8. 

Our measurements assume a flat ΛCDM cosmology with
Ω_m = 0.274 and H_0 = 70 km s^-1 Mpc^-1. For all quantities
related to w_p, or to N-body simulations, we adopt
H_0 = 100h km s^-1 Mpc^-1 and hence distance and mass units
are written as h^-1 Mpc and h^-1 M⊙. Note that there are
small differences between this choice and the cosmological
parameters assumed for the N-body simulations that we
introduce in Section 4.


2 OBSERVATIONAL DATA 

This section begins with a brief review of the BOSS DR10
CMASS sample. In addition to the BOSS sample, our analysis
also relies on data from the SDSS Stripe 82 region which
is roughly two magnitudes deeper than the SDSS main
survey.

2.1 The BOSS DR10 CMASS sample

The BOSS survey (Dawson et al. 2013) is a part of SDSS-III
which measured 1.5 million spectroscopic redshifts of
luminous galaxies and 160,000 quasars over an extragalactic
footprint covering ~ 10,000 deg^2. Spectroscopic observations
were obtained using the 1000 object fiber-fed BOSS
spectrograph (Smee et al. 2013) on the 2.5 m aperture Sloan
Foundation Telescope at the Apache Point Observatory
(Gunn et al. 1998, 2006). The BOSS pipeline is described in
Bolton et al. (2012), and BOSS galaxies were selected from
Data Release 8 (DR8, Aihara et al. 2011) ugriz photometry
(Fukugita et al. 1996). The main purpose of BOSS is to
measure the BAO feature and RSD from galaxy clustering.
The internal data release 11 (DR11) and the final DR12
dataset are made public in Alam et al. (2015), although the
DR12 large-scale structure CMASS catalog is not yet
publicly available at this point. Using DR11, which contains
nearly one million spectroscopic redshifts of galaxies over
~ 8,500 deg^2, the BOSS collaboration has measured BAO
and RSD signals to an unprecedented precision of 1% and
10%, respectively (e.g., Anderson et al. 2014; Beutler et al.
2014a,b; Samushia et al. 2014).

The BOSS target selection is divided into two samples,
a low-redshift sample (“LOWZ”) that selects luminous red
galaxies at z < 0.43 (for details see Tojeiro et al. 2014) and
a high-redshift sample (“CMASS”) that targets galaxies at
0.43 < z < 0.7 (Reid et al. 2016). This paper focuses only on
the CMASS sample, which is selected using a series of
color-magnitude cuts motivated by stellar population models from
Maraston et al. (2009). The CMASS sample is selected as:

$$
\begin{aligned}
17.5 &< i_{\rm cmod} < 19.9, \\
r_{\rm mod} - i_{\rm mod} &< 2.0, \\
d_\perp &> 0.55, \\
i_{\rm fib2} &< 21.5, \\
i_{\rm cmod} &< 19.86 + 1.6\,(d_\perp - 0.8),
\end{aligned}
\tag{1}
$$

where

$$
d_\perp = r_{\rm mod} - i_{\rm mod} - (g_{\rm mod} - r_{\rm mod})/8.0. \tag{2}
$$

Model magnitudes are denoted with the subscript ‘mod’, 
composite model magnitudes are denoted with the subscript 
‘cmod’, fiber magnitude within a 2" aperture is denoted with 
the subscript ‘fib2’. The BOSS color cuts are computed using 
model magnitudes, whereas magnitude cuts are computed 
using cmodel magnitudes. All magnitudes are corrected for 
Galactic extinction using the dust maps of Schlegel et al. 
(1998). 
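
The cuts of Equations (1) and (2) can be written as a simple boolean test. The sketch below is illustrative only: the function names `d_perp` and `is_cmass` are hypothetical, the inputs are assumed to be extinction-corrected magnitudes, and only the cuts listed above are applied.

```python
def d_perp(g_mod, r_mod, i_mod):
    """Auxiliary color of Eq. (2), built from model magnitudes."""
    return (r_mod - i_mod) - (g_mod - r_mod) / 8.0


def is_cmass(g_mod, r_mod, i_mod, i_cmod, i_fib2):
    """Return True if an object passes the cuts listed in Eq. (1).

    Colors use model magnitudes, magnitude limits use cmodel magnitudes,
    and i_fib2 is the i-band fiber magnitude within a 2" aperture.
    """
    dp = d_perp(g_mod, r_mod, i_mod)
    return (
        17.5 < i_cmod < 19.9
        and (r_mod - i_mod) < 2.0
        and dp > 0.55
        and i_fib2 < 21.5
        and i_cmod < 19.86 + 1.6 * (dp - 0.8)
    )


# Made-up magnitudes, used purely to exercise the function.
example_pass = is_cmass(21.5, 20.0, 19.0, i_cmod=19.5, i_fib2=21.0)
example_fail = is_cmass(21.5, 20.0, 19.0, i_cmod=19.95, i_fib2=21.0)
```

The last condition is the sliding magnitude cut: the faint limit in i_cmod becomes brighter for objects with smaller d_perp.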

In this paper, we use the CMASS sample from the public
DR10 dataset (Ahn et al. 2014) that includes 409,365
galaxies over 4,892 deg^2 in the North Galactic Cap (NGC)
and 112,593 galaxies over 1,432 deg^2 in the South Galactic
Cap (SGC). Note that these numbers differ from those
reported in Anderson et al. (2014) simply because we adopt
a different weighting scheme for our clustering
measurements (see following section). While previous studies have
focused on sub-samples of CMASS in limited redshift or
magnitude ranges (e.g., Guo et al. 2013; Miyatake et al. 2013;
Guo et al. 2014; More et al. 2014), in this paper we model
the full CMASS sample over the full redshift range 0.43 <
z < 0.7.


2.2 Stripe 82 Co-add Catalog Combined with
UKIDSS Photometry For Improved Stellar
Mass Estimates

A key aspect of our approach is the use of Stripe 82 — a 
deeper but narrower subset of the survey area — for which 
it is possible to construct a galaxy sample with a well- 
understood completeness function. Stripe 82 provides two 
key advantages. First, it was the subject of repeat imaging
campaigns in SDSS and therefore reaches ugriz depths
that are roughly two magnitudes deeper than the single-epoch
SDSS imaging that was used to construct the BOSS
target catalog. This added depth is critical for obtaining
reliable photometric redshifts (photo-z’s) for massive galaxies
(log10(M*/M⊙) > 11) that can be used to supplement the
color-selected BOSS samples out to z ~ 0.7. Second, Stripe
82 was imaged by the UKIRT Infrared Deep Sky Survey
(UKIDSS, Lawrence et al. 2007), providing near-IR photometry
for robust stellar mass estimates.

In this paper, we use the Stripe 82 Massive Galaxy catalog
(hereafter, S82-MGC). The S82-MGC catalog construction,
photometric matching, redshift validation, masking, and
other details are described in Bundy et al. (2015). The
S82-MGC catalog contains all classified galaxies from
UKIDSS-LAS frames with 10σ detection limits deeper than YJHK =
[20.2, 20.2, 20.2, 20.6] (AB) (Oke & Gunn 1983). These limits
are those roughly needed for 10σ detections in these
bands of z ~ 0.6 passive galaxies with log10(M*/M⊙) >
11.2. UKIDSS and BOSS masks are applied to this catalog,
which covers a total area of 139.4 deg^2.

The S82-MGC catalog contains both spectroscopic and 
photometric redshifts. For each galaxy, we adopt the
spectroscopic redshift when it is available and use the
photometric redshift otherwise. L15 demonstrate that the impact of
photo-z scatter on the high mass end of the SMF is
negligible. Stellar masses are estimated for this catalog by applying
the SED-fitting code described in Bundy et al. (2010) to the 
SDSS+UKIDSS PSF-matched photometry. For a prior grid 
of SED templates and a Chabrier IMF (Chabrier 2003), an 
M* probability distribution is obtained by scaling the model 
M/L ratios by the inferred luminosity in the observed
K-band, or i-band if a K-band magnitude is not available.
The median of this distribution is taken as the M* estimate. 


3 CORRELATION FUNCTION AND STELLAR 
MASS FUNCTION MEASUREMENTS 

This section summarizes our measurements of several
statistics derived from the observational data described in the
previous section. After briefly explaining the measurement of
the two-point correlation function (note that we use the
measurement computed by Reid et al. (2014, hereafter R14)), we
present our measurement of the CMASS SMFs as a function
of redshift.

3.1 The CMASS Two-Point Correlation Function 

In this paper we adopt the DR10 projected two-point
correlation function, w_p, the monopole and quadrupole of
the correlation function, and the associated covariance
matrices determined by R14. We only give a brief summary
of how these measurements were performed; we refer the
reader to R14 for additional details. The two-dimensional
redshift-space correlation function ξ(s) is measured using
the Landy-Szalay estimator (Landy & Szalay 1993):

$$
\xi(s) = \frac{DD(\Delta s) - 2DR(\Delta s) + RR(\Delta s)}{RR(\Delta s)}, \tag{3}
$$

where DD, DR, and RR are the data-data, data-random,
and random-random pairs in a given bin [s - Δs/2, s + Δs/2].
The randoms account for the survey geometry and for the
completeness factor which depends on angular position and
a radial selection function, dn/dz. The correlation function
is integrated over the line-of-sight separation to obtain the
projected correlation function (Davis & Peebles 1983),

$$
w_p(r_p) = 2 \int_0^{r_{\pi,\max}} \xi(r_p, r_\pi)\, dr_\pi, \tag{4}
$$

where the three-dimensional pair separation s in redshift
space is split into components transverse (r_p) and parallel
(r_π) to the line-of-sight direction. The integral is
performed to r_π,max = 80 h^-1 Mpc and w_p is measured from
0.194 h^-1 Mpc to 25.98 h^-1 Mpc in 18 equally spaced
logarithmic bins. The advantage of using the projected
correlation function is that it is less sensitive than ξ(s) to the
effects of galaxy peculiar velocities. Note, however, that we
do account for the RSD effect (van den Bosch et al. 2013)
in our modeling through the velocity of subhalos. The
projected two-point correlation function is measured separately
for the North and South Galactic Caps and these measurements
are combined using a simple average, weighted by the
number of CMASS galaxies in each hemisphere.
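
To make the two steps above concrete, the following sketch evaluates the Landy-Szalay estimator for a single bin and then performs the line-of-sight integration as a discrete sum. It is not the R14 pipeline: the function names are hypothetical, the pair counts are assumed to be already normalized by the total numbers of data and random pairs, and the line-of-sight bins are assumed to have equal width.

```python
def landy_szalay(dd, dr, rr):
    """xi = (DD - 2 DR + RR) / RR for normalized pair counts in one bin."""
    return (dd - 2.0 * dr + rr) / rr


def projected_wp(xi_along_los, d_pi):
    """Discrete form of the projection integral.

    xi_along_los -- xi(r_p, r_pi) in equal-width line-of-sight bins that
                    together span 0 to r_pi,max at fixed r_p
    d_pi         -- width of each line-of-sight bin (h^-1 Mpc)
    """
    return 2.0 * sum(xi_along_los) * d_pi


# Toy numbers: a single bin with twice as many data pairs as random pairs.
xi_bin = landy_szalay(dd=2.0, dr=1.0, rr=1.0)

# A constant xi = 0.5 over 80 bins of 1 h^-1 Mpc (i.e. r_pi,max = 80).
wp_toy = projected_wp([0.5] * 80, d_pi=1.0)
```

The factor of two accounts for integrating only over positive line-of-sight separations.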

The w_p measurement from R14 does not use the optimal
weights (the so-called “FKP” weights), or the systematic
weights (Anderson et al. 2014). The systematic weights
affect large scales and hence are not relevant for our
small-scale measurement. Also, this approach enables a fairer
comparison with our measurement of the galaxy SMF which does
not use any weighting schemes. Weights are applied,
however, to account for redshift failures and for fiber collisions.
Fiber collisions are particularly important for small scale
clustering measurements with BOSS: the fiber-collision
scale in BOSS is 62" which corresponds to a comoving scale
of ~ 0.45 h^-1 Mpc at z ~ 0.57. To complicate matters, the
BOSS tiling strategy also introduces a correlation between
fiber collisions and the density field. R14 studied the
impact of fiber collisions for the CMASS sample using tiled
mock catalogs. They adopt a radially dependent correction
scheme in which an angular up-weighting method is used at
r_p < 1.09 h^-1 Mpc and a nearest neighbor (NN) weighting
scheme is used at larger scales. Finally, the correlation
function is debiased for residual fiber-collision effects using the
tiled mock catalogs.
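
The conversion from the 62" fiber-collision angle to a comoving transverse scale can be checked with a short calculation. The sketch below assumes the flat ΛCDM cosmology with Ω_m = 0.274 adopted for the measurements and uses a simple trapezoid rule; the function name `comoving_distance` and the number of integration steps are illustrative choices.

```python
import math


def comoving_distance(z, omega_m=0.274, steps=2000):
    """Comoving distance in h^-1 Mpc for a flat LCDM cosmology.

    Integrates c / H(z) with the trapezoid rule; radiation is neglected.
    """
    c_over_h100 = 2997.92458  # c / (100 km/s/Mpc), in Mpc
    dz = z / steps
    total = 0.0
    for k in range(steps + 1):
        zk = k * dz
        inv_e = 1.0 / math.sqrt(omega_m * (1.0 + zk) ** 3 + 1.0 - omega_m)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * inv_e
    return c_over_h100 * total * dz


theta = 62.0 / 3600.0 * math.pi / 180.0        # 62 arcsec in radians
r_collision = theta * comoving_distance(0.57)  # comoving h^-1 Mpc
```

With these inputs the transverse comoving scale comes out close to the 0.45 h^-1 Mpc quoted in the text.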

The covariance matrix for w_p, C_wp,boot, is derived from
5,000,000 realizations drawn from 200 bootstrap regions
which are roughly equal in size and shape. An additional
10% uncertainty due to the angular up-weighting method
and the debiasing procedure is propagated into the diagonal
elements of the covariance matrix. As a result, the
measurement error on w_p increases below r_p = 1.09 h^-1 Mpc.
Finally, the inverse covariance matrix is corrected following
Hartlap et al. (2007). With n_boot = 200 and n_bin = 18, this
leads to a 0.904 correction to the final inverse covariance
matrix.
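
The quoted correction can be reproduced from the standard Hartlap et al. (2007) expression, (n - p - 2)/(n - 1), where n is the number of independent samples used to estimate the covariance and p is the number of data bins. The short sketch below assumes this is the form applied here; the function name is hypothetical.

```python
def hartlap_factor(n_samples, n_bins):
    """Multiplicative debiasing factor for an inverse covariance matrix."""
    if n_samples <= n_bins + 2:
        raise ValueError("need n_samples > n_bins + 2")
    return (n_samples - n_bins - 2.0) / (n_samples - 1.0)


# 200 bootstrap regions and 18 w_p bins, as quoted in the text.
factor_wp = hartlap_factor(200, 18)
```

For 200 regions and 18 bins this gives 180/199, in agreement with the value of about 0.904 quoted above.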

In addition to w_p, we will also use the monopole and
quadrupole of the correlation function, which contain
information about the peculiar velocities of galaxies. Again,
following R14, we adopt the pseudo multipole correlation
function defined by

$$
\hat{\xi}_\ell(s) = (2\ell + 1) \int_0^{\mu_{\max}(s)} d\mu\, \xi(s, \mu)\, L_\ell(\mu), \tag{5}
$$

where s^2 = r_p^2 + r_π^2, μ = r_π/s, and L_ℓ is the ℓ-th order
Legendre polynomial. The integration over the azimuthal
angle μ is performed up to μ_max(s) = 0.534 ... in order
to minimize the impact of fiber collisions on the small-scale
measurements. We refer the reader to R14 for further details.

3.2 The Stripe 82 Stellar Mass Function at z = 0.55

As shown in L15, the CMASS sample is only stellar mass 
complete at the high mass end and in a narrow redshift 
range. To perform abundance matching, however, we need 
to measure the total SMF. Indeed, for abundance matching, 
a complete galaxy sample is necessary when rank ordering 
galaxies versus halos. 

Bundy et al. (2015) present an estimate of the SMF at
z ~ 0.5 by using the S82-MGC catalog. In order to compute
the SMF, Bundy et al. (2015) use a combination of
spectroscopic redshifts, supplemented with photometric redshifts
(photo-z’s) when a spectroscopic redshift is not available.
We adopt a similar approach and compute the SMF from
the S82-MGC at log10(M*/M⊙) > 10.5 over 0.43 < z < 0.70.
Our analysis assumes that the SMF does not vary over this
redshift range. The result is shown in Figure 1. Error bars
on the SMF represent the square root of the diagonal
components of the covariance matrix, which is estimated from
the data using 214 nearly-equal area bootstrap regions.

Because the majority of galaxies at the high mass end 
have a spectroscopic redshift, the impact of photo-z
uncertainty on the Stripe 82 SMF is negligible (see L15), i.e., the
use of photometric redshifts only adds a negligible amount 
of scatter in the total stellar mass estimate and does not 
inflate the high mass end of the SMF. 

The left panel of Figure 1 presents a comparison
between our SMF and results from COSMOS
(Leauthaud et al. 2011) and PRIMUS (Moustakas et al.
2013) at similar redshifts. Figure 1 demonstrates that,
because of the large area covered by Stripe 82, the high
mass end of the total SMF is tightly constrained at
log10(M*/M⊙) > 11.3 over 0.43 < z < 0.70, while COSMOS
and PRIMUS constrain the low mass end. The comparison
with COSMOS and PRIMUS suggests that the S82-MGC is
complete to log10(M*/M⊙) ~ 11.2 at z = 0.7 (Bundy et al.
2015).

We will use the S82-MGC SMF measured using 8 data
points over the range 11.5 ≤ log10(M*/M⊙) ≤ 12.0. The
inverse covariance matrix for the S82-MGC SMF, C^-1_SMF, is
computed as follows. First we compute the covariance
matrix C_SMF,boot from 214 bootstrap regions, and then smooth
the noise in the off-diagonal components using a boxcar
algorithm (Mandelbaum et al. 2006). Finally we multiply
the inverse by the Hartlap correction factor, which is ~ 0.958,
i.e., C^-1_SMF = 0.958 C^-1_SMF,boot. Although the error budget is
dominated by Poisson noise, which only contributes to the
diagonal components (Smith 2012), the Poisson errors alone
underestimate the uncertainty. We find that the diagonal
components of our jackknife covariance matrix are larger than the
Poisson errors by ~ 30% in the mass range of interest.
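
As a schematic of the first step, the covariance matrix can be estimated from a set of resampled SMF measurements as the sample covariance of the data vectors. The sketch below assumes each resample yields one SMF vector with a fixed number of mass bins; the function name `sample_covariance` is hypothetical, and the boxcar smoothing and Hartlap correction are not included.

```python
def sample_covariance(samples):
    """Sample covariance of a list of equal-length data vectors.

    samples -- list of n vectors (e.g. the SMF measured in each bootstrap
               resample), each with p entries (mass bins)
    Returns a p x p nested list, normalized by n - 1.
    """
    n = len(samples)
    p = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for s in samples:
        for i in range(p):
            for j in range(p):
                cov[i][j] += (s[i] - mean[i]) * (s[j] - mean[j])
    return [[c / (n - 1) for c in row] for row in cov]


# Two toy "resamples" of a two-bin measurement.
cov_toy = sample_covariance([[1.0, 2.0], [3.0, 6.0]])
```

The square roots of the diagonal entries of such a matrix are what the error bars in Figure 1 represent.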


3.3 SMF of CMASS galaxies as a Function of 
Redshift 

The other ingredient that will be important in our analysis
is the set of SMFs of CMASS galaxies as a function of redshift.
The right panel of Figure 1 shows SMFs for CMASS galaxies
measured using the S82-MGC in 7 redshift bins with Δz =
0.04. As can be seen from the right panel of Figure 1, the
completeness of CMASS depends both on redshift and stellar
mass; this is because the effects of the magnitude and color
cuts that define the CMASS sample vary with redshift. The
utility of these CMASS SMFs will be apparent when we
describe our methodology in Section 5.
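
One simple way to quantify the completeness in a given redshift bin is the ratio of the CMASS SMF to the total SMF in each stellar mass bin. The sketch below illustrates this under the assumption that both SMFs are tabulated on the same mass bins; the function name is hypothetical and the exact definition used in the analysis is given in Section 5.

```python
def completeness(phi_cmass, phi_total):
    """Per-bin ratio of the CMASS SMF to the total SMF, capped at 1.

    phi_cmass -- CMASS number densities per mass bin in one redshift bin
    phi_total -- total number densities in the same mass bins
    """
    ratios = []
    for cmass, total in zip(phi_cmass, phi_total):
        if total > 0.0:
            ratios.append(min(cmass / total, 1.0))
        else:
            ratios.append(0.0)  # no galaxies at all in this bin
    return ratios


# Toy number densities in two mass bins (arbitrary units).
comp_toy = completeness([1.0e-5, 4.0e-6], [1.0e-4, 5.0e-6])
```

The cap at unity guards against noise in sparsely populated bins at the high mass end.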


4 SUBHALO CATALOG 

In this section we present the N-body simulation and
subhalo catalog that are essential ingredients in our
abundance-matching study. We also perform tests of the completeness
of the subhalo catalog.


4.1 N-body Simulation

Because the BOSS DR10 CMASS sample covers a large
comoving volume, V ~ 2.6 (h^-1 Gpc)^3, with a high number
density of n ~ 3 × 10^-4 (h^-1 Mpc)^-3, our analysis requires
a large volume N-body simulation that can resolve
halos to 10^^ Mq. We use the publicly available MultiDark 
simulation, MDR1 (Prada et al. 2012; Riebe et al. 2013).
The cosmological parameters in MDR1 are consistent with
a flat WMAP5 ΛCDM cosmology (Komatsu et al. 2009):
Ω_m0 = 0.27, Ω_Λ = 0.73, Ω_b0 = 0.047, n_s = 0.95, and
σ_8 = 0.82. This cosmology is similar to the one used for our
measurements of the clustering signals; we therefore safely
ignore the cosmological uncertainty in the distance scale (More
2013). MDR1 is an L_box = 1.0 h^-1 Gpc simulation with a
particle mass of 8.7 × 10^9 h^-1 M⊙ (N_par = 2048^3 particles). We
use an output at z = 0.534, which is close to the peak of
the BOSS CMASS dn/dz at z ~ 0.55.


4.2 Halo Catalogs and Merger Trees 

Halos and subhalos are identified using the Rockstar
algorithm (Behroozi et al. 2013b, a). Rockstar is a phase-space
halo finder that also considers halo merger histories to
provide a robust and stable identification of halos and
subhalos. The advantages of Rockstar compared to other halo
finders are well documented in Knebe et al. (2011) and
Onions et al. (2012). These studies suggest that among halo
finders, Rockstar is the least sensitive to resolution
effects. Rockstar, with the Consistent Trees algorithm,
produces halo merger trees and catalogs with various
parameters derived from the halo assembly history, including, for
example, V_peak, the maximum halo circular velocity for each
subhalo.

We use the “Z”-axis of the MDR1 simulation as the
line-of-sight direction. In order to maximize the volume of our
mock, we re-map the 1 h^-1 Gpc MDR1 cube into a cuboid
of dimensions (X, Y, Z) = (3.7417, 0.4082, 0.6547) L_box
following the method developed by Carlson & White (2010).
After remapping, the Z-axis has a length of 654.7 h^-1 Mpc,
corresponding to a redshift range of 0.42 < z < 0.71. This
includes a margin that is sufficient to account for peculiar
velocities at the boundary of our mock catalog.
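
A quick consistency check on the quoted cuboid is that the remapping preserves the simulation volume, so the product of the three side lengths in units of L_box should be unity. The snippet below verifies this for the rounded dimensions given above; the variable names are illustrative.

```python
# Cuboid side lengths (X, Y, Z) in units of L_box, as quoted in the text.
dims = (3.7417, 0.4082, 0.6547)

# Volume in units of L_box^3; a volume-preserving remapping gives ~1.
volume = dims[0] * dims[1] * dims[2]

# Length of the line-of-sight (Z) side for L_box = 1000 h^-1 Mpc.
z_length = dims[2] * 1000.0
```

The product differs from one only at the level expected from rounding the side lengths to four decimal places.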

Peculiar velocities of subhalos are defined as the average
velocity of particles within the innermost 10% of the
virial radius. The virial overdensity in Rockstar is defined by
Δ_vir ~ 237 ρ_m at z = 0.534. This definition does not
correspond to the definition of the halo bulk flow velocity that
uses all particle members of the halo, because the halo core
and its outer regions have different velocity structure. For a
demonstration of this point, see Figure 11 of Behroozi et al.
(2013b) and also Appendix B of R14.^1 All subhalos are
mapped into redshift space by including the peculiar
velocity component along the Z direction before performing
abundance matching.
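
The mapping into redshift space can be sketched in the plane-parallel approximation, where the comoving line-of-sight coordinate is shifted by (1 + z) v_Z / H(z). The example below assumes comoving positions in h^-1 Mpc, peculiar velocities in km/s, and the MDR1 value Ω_m = 0.27 at the snapshot redshift z = 0.534; the function name is hypothetical and no boundary handling is included.

```python
import math


def redshift_space_z(x_z, v_z, z=0.534, omega_m=0.27):
    """Shift a comoving Z coordinate (h^-1 Mpc) by the peculiar velocity.

    x_z -- real-space comoving position along the line of sight
    v_z -- peculiar velocity along the line of sight (km/s)
    Uses H(z) = 100 h E(z) km/s/Mpc for a flat LCDM cosmology, so the
    factors of h cancel when distances are in h^-1 Mpc.
    """
    e_z = math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)
    return x_z + v_z * (1.0 + z) / (100.0 * e_z)


# Displacement produced by a 300 km/s line-of-sight velocity.
shift = redshift_space_z(0.0, 300.0)
```

For these parameters a velocity of 300 km/s corresponds to a displacement of roughly 3.5 h^-1 Mpc, which gives a sense of the margin needed at the boundaries of the mock.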


^1 The definition of halo peculiar velocity in R14 is the average velocity of
particles within ~ 33% of the virial radius where Δ_vir ~ 200 ρ_m.

Figure 1. (Left) The total SMF from Stripe 82 (black and grey squares) measured from S82-MGC (139.4 deg^2) and the SMF measured
using only CMASS galaxies (magenta squares). Other SMFs determined from smaller area surveys at similar redshifts are also shown.
Red, blue, and green circles indicate results from PRIMUS (5.5 deg^2) at 0.4 < z < 0.5, 0.5 < z < 0.65, and 0.65 < z < 0.8, respectively.
Cyan triangles represent one wide redshift bin from the COSMOS survey (1.64 deg^2). As explained later, we only use data points with
log10(M*/M⊙) > 11.5 (black squares) when fitting against the SMF data. (Right) SMFs as a function of redshift measured using only
the CMASS sample. As a reference, we also present the total SMF from the S82-MGC at 0.43 < z < 0.70 and log10(M*/M⊙) > 10.5. As
demonstrated in L15, the CMASS SMFs vary with redshift and CMASS is only complete in terms of stellar mass at the highest masses
and in a relatively narrow redshift range.


4.3 Time evolution and Resolution Tests 

In this section, we discuss potential issues in the subhalo
catalog, focusing in particular on the time evolution of subhalo
clustering and completeness issues due to the resolution of
the simulation. Here we only summarize our findings; figures
and further details can be found in Appendix A.

We adopt a single redshift output at z = 0.534 from
the MDR1 simulation. We test if a single redshift output is
sufficient to model CMASS over the redshift range 0.43 <
z < 0.7. There are three outputs available in the MDR1
simulation over the redshift range of interest: z = 0.466, 0.534
and 0.609. Using these redshift outputs, we find a difference 
in the real-space correlation function at fixed number den¬ 
sity, n ~ 1.58 X 10“^(/i“^Mpc)“®, at the 1-2 % level at 
large scales. The largest differences (at the level of 5%) are
seen at the 1-halo to 2-halo regime at r < 1 h^-1 Mpc (see
Appendix A). This level of evolution is below our measurement
errors, but these effects will need to be taken into account in
future work, especially when the S/N of the measurements
increases (currently we are using DR10 measurements).

We also perform two tests concerning the impact of the
resolution of MDR1 on our results. First, we determine if
the subhalo catalog resolves the mass scale required for our
abundance matching. Based on White et al. (2011) and R14,
we estimate that abundance matching for CMASS will
require subhalos with V_peak ≳ 200 km s^-1. Our tests
demonstrate that MDR1 resolves halos down to V_peak ~ 150 km s^-1.

Second, we examine the impact of resolution effects on
the radial profiles of subhalos. Our estimates suggest that
subhalo radial profiles become incomplete at 0.1-0.7 h^-1 Mpc
(depending on the ratio between the peak velocities of
hosts and subhalos). The smallest scale in our w_p
measurement is ≈ 0.2 h^-1 Mpc and is close to this incompleteness
limit. The impact of resolution on our results is at least
partly counteracted by the boost to the errors of our
measured w_p by systematic fiber-collision correction
uncertainties on these scales. We conclude that the resolution of MDR1
is sufficient for our purpose, but that recently completed
higher resolution simulations such as Skillman et al. (2014)
or Ishiyama et al. (2014) would be preferable and will be
adopted in subsequent work.


5 METHODOLOGY 

Our goal is to find a model of the CMASS-halo connection
which can simultaneously explain the SMF and the
two-point correlation function and which also accounts for the
stellar mass completeness of CMASS. This section explains the
details of our methodology. In this paper we only explore
models that reproduce the projected two-point correlation
function of the full CMASS sample over the redshift range of
0.43 < z < 0.7. In future work we will explore how well our
models match the clustering of sub-samples (e.g., dividing
CMASS by color and redshift).

We begin with a broad overview of our global methodology
and the two classes of models explored in this paper.
The details of our approach are then provided in the later 
half of this section. The casual reader may wish to read the 
overview of the methodology and then skip directly to the 
summary provided in Section 5.6. 

5.1 Overview of Methodology and Models 

Our approach is based on the SHAM framework for connecting
galaxies and dark matter halos (see Section 5.2). Within
the context of SHAM, we will explore two broad classes of
models that relate galaxy color to halo properties. The first
model that we explore is a “stochastic model” in which, at
fixed stellar mass, galaxy color in high-mass halos is simply
a random process that does not correlate with halo properties.
We will refer to this model as the “AbM” model. After
abundance matching our mock catalog, we tag CMASS
galaxies by randomly down-sampling the full mock galaxy
catalog in such a way that the mock CMASS SMFs reproduce
the ones measured in Section 3.3. Unless an additional
correlation between this CMASS flag and halo properties is
explicitly introduced, this procedure makes the implicit
assumption that at fixed stellar mass, CMASS galaxies are a
random sample of the overall population. However, L15 show
that at fixed stellar mass, CMASS is not a random sample
of the overall population in terms of galaxy color. Hence,
the abundance matched catalog that we obtain after the
down-sampling procedure will only correctly represent the
true relation between galaxy color, stellar mass, and halo
properties if color is a random process at fixed stellar mass.

The second model is an extension to the traditional
abundance matching scheme introduced by H13 called age
matching. This model is based on the premise that galaxy
color correlates with a secondary halo property at fixed
stellar mass (see Section 5.3). After first abundance matching
our mock catalog, the age-matched model will be built
by re-shuffling CMASS galaxies according to a secondary
halo property. In order to fully implement the age-matching
model, however, we need to characterize the color distributions
of galaxies from the S82-MGC as a function of mass
and redshift and also to understand the effects of scatter
introduced in these color distributions from photometric
redshifts. This is a non-trivial task that we defer to Paper II,
opting here instead to simply perform a qualitative investigation
of the effects of age matching on the two-point correlation
function. For this purpose, we will adopt a simple
color model for the galaxy population that is based on a
“color-rank distribution” represented by λcol, which effectively
characterizes the color ranking of CMASS galaxies versus
other galaxies. This distribution is characterized by one
free parameter called μCMASS. As described in §5.3, this
parameter controls the correlation strength between subhalo
properties and the CMASS selection function.


5.2 Subhalo Abundance Matching 

The SHAM scheme provides an effective and simple way
to model the galaxy-halo relation and has been successful
at modeling both the galaxy stellar mass function and
the galaxy two-point correlation function (see e.g.,
Kravtsov et al. 2004; Vale & Ostriker 2004; Conroy et al.
2006; Moster et al. 2010; Behroozi et al. 2010). The basic
philosophy of SHAM is that massive (sub)halos host bright
galaxies. More concretely, the SHAM method begins by rank
ordering galaxies by stellar mass M* (or luminosity). Halos
drawn from N-body simulations are rank ordered by peak
maximum circular velocity Vpeak. Galaxies are then assigned
to subhalos in descending order such that ngal(> M*) =
nhalo(> Vpeak). In practice, there are multiple ambiguities
in the SHAM technique. First, there is freedom in choosing
how to rank order subhalos. For example, Reddick et al.
(2013) showed how the predicted two-point correlation
function varies when rank ordering is performed using different
halo mass proxies such as halo mass Mvir, maximum circular
velocity Vcirc, and its peak over the entire merging history,
Vpeak. Motivated by this work, we will evaluate how our
model varies when rank ordering by either Vpeak or Mpeak.
Second, SHAM models must also account for scatter between
galaxy properties and halo properties. We account for scatter
by adopting the methodology of Behroozi et al. (2010)
and Reddick et al. (2013).
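
The rank-ordering step can be illustrated with a minimal sketch. It assumes zero scatter and equal numbers of galaxies and subhalos drawn from the same volume, so that matching ranks is equivalent to matching cumulative number densities. The function name `abundance_match` and the toy values are hypothetical and are not taken from our pipeline.

```python
def abundance_match(vpeak, log_mstar):
    """Assign stellar masses to subhalos by rank (no scatter).

    The subhalo with the largest Vpeak receives the largest stellar
    mass, the second largest receives the second largest, and so on.
    Both lists must have the same length (same comoving volume assumed).
    """
    if len(vpeak) != len(log_mstar):
        raise ValueError("need one stellar mass per subhalo")
    # Subhalo indices sorted by descending Vpeak.
    halo_order = sorted(range(len(vpeak)), key=lambda i: vpeak[i], reverse=True)
    # Stellar masses sorted in descending order.
    masses = sorted(log_mstar, reverse=True)
    assigned = [0.0] * len(vpeak)
    for rank, halo_index in enumerate(halo_order):
        assigned[halo_index] = masses[rank]
    return assigned


# Toy example: four subhalos (Vpeak in km/s) and four stellar masses (dex).
vpeak = [250.0, 410.0, 320.0, 180.0]
log_mstar = [11.2, 10.8, 11.6, 11.0]
matched = abundance_match(vpeak, log_mstar)
```

Replacing `vpeak` by a list of Mpeak values gives the alternative rank ordering discussed above; scatter is a separate step that this sketch does not include.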

To perform abundance matching, we need to evaluate
the total SMF over the entire mass range covered by the
CMASS sample, i.e., down to log10(M*/M⊙) ~ 10.6. This
value is below the completeness limit of the S82-MGC. Our
strategy will be to fit the total SMF from the S82-MGC in
the range log10(M*/M⊙) > 11.5 using a double Schechter
function (Baldry et al. 2008):

$$ \phi(M_*; \phi_1, \alpha_1, \phi_2, \alpha_2, M_0) = \left[ \phi_1\, 10^{(\alpha_1+1)(\log M_* - \log M_0)} + \phi_2\, 10^{(\alpha_2+1)(\log M_* - \log M_0)} \right] (\ln 10)\, \exp\left[ -\frac{M_*}{M_0} \right], \tag{6} $$


where |α2| > |α1| and the second term dominates at
the low-mass end. The amplitude of the SMF below
log10(M*/M⊙) = 11.5 is unconstrained by the S82-MGC
SMF but will be adjusted by our joint fit to the clustering
of CMASS galaxies. Section 6 shows that our joint fit
to the S82-MGC SMF and to Wp yields an SMF that is
consistent at the low-mass end with results from PRIMUS and
COSMOS.

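We abundance match subhalos against this SMF, and
convolve it with a uniform log-normal scatter,

$$ \phi_{\rm conv}(M_*; \phi_1, \alpha_1, \phi_2, \alpha_2, M_0, \sigma) = \int dm\, \phi(10^{m}; \phi_1, \alpha_1, \phi_2, \alpha_2, M_0)\, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{(m - \log M_*)^2}{2\sigma^2} \right], \tag{7} $$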


which introduces a scatter in the relation between stellar and
halo mass. This scatter arises due to a combination of intrinsic
scatter in the stellar-to-halo-mass relation and errors
associated with stellar mass measurements (Behroozi et al.
2010; Leauthaud et al. 2011). Hence, for a realistic model,
the value of σ must be equal to, or greater than, the
measurement errors in stellar mass measurements. We will return
to this question in Section 6.
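
As an illustration of Equations (6) and (7), the following sketch evaluates a double Schechter function per dex and convolves it numerically with a Gaussian in log stellar mass. The midpoint integration over ±6σ, the function names, and the parameter values in the example are assumptions made for illustration only; this is not the fitting code used in this paper.

```python
import math


def double_schechter(log_m, phi1, alpha1, phi2, alpha2, log_m0):
    """Double Schechter function per dex, following the form of Equation (6)."""
    x = log_m - log_m0
    amplitude = phi1 * 10 ** ((alpha1 + 1) * x) + phi2 * 10 ** ((alpha2 + 1) * x)
    return amplitude * math.log(10) * math.exp(-(10 ** x))


def convolved_smf(log_m, sigma, params, n_steps=400, n_sigma=6.0):
    """Gaussian (in log M*) convolution of the SMF, following Equation (7).

    Uses a simple midpoint rule on [log_m - n_sigma*sigma, log_m + n_sigma*sigma].
    """
    lower = log_m - n_sigma * sigma
    dm = 2.0 * n_sigma * sigma / n_steps
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for k in range(n_steps):
        m = lower + (k + 0.5) * dm
        kernel = norm * math.exp(-((m - log_m) ** 2) / (2.0 * sigma ** 2))
        total += double_schechter(m, **params) * kernel * dm
    return total


# Illustrative parameter values (assumed for this example).
params = dict(phi1=1.86e-3, alpha1=-0.46, phi2=3.0e-4, alpha2=-1.58, log_m0=10.89)
raw = double_schechter(12.0, **params)
smeared = convolved_smf(12.0, 0.105, params)
```

Because the SMF falls steeply at the high-mass end, the convolved value exceeds the unconvolved one there: scatter moves more galaxies up from lower masses than down from higher masses.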

We fit the S82-MGC SMF over 8 data points at 11.5 ≤
log10(M*/M⊙) ≤ 12.0. Our SMF measurements probe the
high-mass end of the stellar mass function and hence are
insensitive to some parameters in the double Schechter
function. For this reason, in our fits, we simply fix the
parameters that are not sensitive to the very high-mass end to
(α1, φ2, α2) = (−0.46, 3.0 × 10^-4, −1.58). This is motivated
by results at the low-mass end from Baldry et al. (2008).
In summary, our abundance matching model contains three
free parameters: φ1, M0, and σ.

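We compute χ² for the S82-MGC SMF as follows:

$$ \chi^2_{\rm SMF} = \sum_{i,j} \left[ \phi_{\rm meas}(M_{*,i}) - \phi_{\rm conv}(M_{*,i}; \phi_1, M_0, \sigma) \right] C^{-1}_{{\rm SMF},ij} \left[ \phi_{\rm meas}(M_{*,j}) - \phi_{\rm conv}(M_{*,j}; \phi_1, M_0, \sigma) \right], \tag{8} $$

where φconv(M*; φ1, M0, σ) is the theoretical SMF predicted
by Equation (7).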


5.3 Subhalo Age Matching 

SHAM essentially specifies the stellar-to-halo mass relation
between galaxies and halos. It is normally assumed that
halo mass is the primary variable on which the galaxy-halo
connection depends. However, in addition to halo
mass, halo clustering also depends on other parameters
such as halo age, a phenomenon known as assembly bias
(see e.g., Gao et al. 2005; Wechsler et al. 2006; Jing et al.
2007; Gao & White 2007; Dalal et al. 2008; Li et al. 2008;
Lin et al. 2015; Miyatake et al. 2015).

H13 introduced an extension to the traditional abundance
matching scheme called age matching which can reproduce
the color-dependent clustering of the SDSS main
galaxy sample (also see Masaki et al. 2013). This method
matches galaxies and halos using both stellar mass and
galaxy color. The basic premise of the approach is that
redder galaxies are assigned to older subhalos at fixed stellar
mass.

In the age matching scheme, each halo is assigned a
characteristic redshift (zstarve) computed from halo merger
trees. This zstarve parameter is defined as the maximum of
three distinct age components:

• zchar: the earliest redshift at which the most massive
progenitor of a given subhalo exceeds Mvir > 10^12 h^-1 M⊙.
For subhalos less massive than 10^12 h^-1 M⊙, zchar = zobs.

• zacc: the epoch when a subhalo accretes onto a host
halo. For host halos, zacc = zobs.

• zform: the epoch defined by zform = cvir/(4.1 aacc) − 1,
motivated by the fact that there is a tight correlation between
the concentration parameter and the epoch when
halo growth transits from a fast to a slow accretion regime
(Wechsler et al. 2006). Note that aacc = 1/(1 + zacc).

We adopt zobs = 0.534 while in the original work of H13,
zobs = 0.
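
The definition of zstarve as the maximum of the three components can be written as a short sketch. The function names and the example halo values are hypothetical; only the formula for zform and the value zobs = 0.534 come from the text above.

```python
def z_form(c_vir, z_acc):
    """Formation epoch from concentration: z_form = c_vir / (4.1 * a_acc) - 1."""
    a_acc = 1.0 / (1.0 + z_acc)
    return c_vir / (4.1 * a_acc) - 1.0


def z_starve(z_char, z_acc, c_vir):
    """Characteristic redshift: the maximum of the three age components."""
    return max(z_char, z_acc, z_form(c_vir, z_acc))


# Toy host halo observed at z_obs = 0.534 (so z_acc = z_obs), with an
# assumed z_char = 1.2 and an assumed concentration c_vir = 5.
example = z_starve(z_char=1.2, z_acc=0.534, c_vir=5.0)
```

In this toy case zform is about 0.87, so zchar sets zstarve, which mirrors the high-mass behavior discussed in this section.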

There is a critical difference between this work and H13:
our relevant mass regime (log10(M*/M⊙) > 11) is much
higher than that of H13 (log10(M*/M⊙) < 11). H13 found
that zform is the dominant component of zstarve for the SDSS
main sample whereas we find that zchar is the dominant
component for CMASS (see Section 6). This is in keeping with
the results shown in Figure 5 of Hearin et al. (2014b), which
demonstrates that zchar begins to dominate the contribution
to zstarve for stellar masses log10(M*/M⊙) > 11.5 at z ~ 0.
In our CMASS sample, these higher-mass galaxies dominate
the sample, whereas the Main Galaxy Sample is dominated
by lower-mass galaxies. Because of these key differences, the
impact of assembly bias in our models will be qualitatively
different compared to H13 (see Section 6).

In Paper II we will use the actual color distributions
of massive galaxies as a function of redshift to perform age
matching. Our goal in this paper, however, is to perform a
first qualitative analysis of the general effects of age matching
above collapse mass, a regime that has not yet been fully
investigated. For this purpose, we introduce a simple
color-rank distribution denoted λcol. This color-rank distribution
will be used to assign “colors” to CMASS and to non-CMASS
galaxies and to perform the color-based rank ordering in the
age matching scheme. Our goal is to construct a model that
allows for a simple “mixing” between these two populations.

Operationally, we accomplish this mixing with our age
matching model as follows. First, at each stellar mass we
generate a random distribution of λcol values. Suppose there
are Nh subhalos in the stellar mass bin, and that the fraction
of galaxies of this stellar mass that are CMASS-selected is
denoted by fCMASS. We then draw fCMASS × Nh values from
a Gaussian distribution of mean μCMASS and unit variance;
these draws will be the “colors” λcol of our mock CMASS
galaxies. We next draw (1 − fCMASS) × Nh values from a
Gaussian distribution of zero mean and unit variance; these
draws will be the “colors” λcol of our non-CMASS galaxies.
We then rank-order the joint collection of the randomly
drawn values of λcol. Subhalos in the same stellar mass bin
are rank-ordered by zstarve. In monotonic fashion, the larger
λcol draws are assigned to the subhalos with larger zstarve
values, and the CMASS-designation associated with λcol is
also assigned to the subhalo, defining the CMASS selection
function in the “AgM” model.
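
The operational steps above can be sketched for a single stellar mass bin as follows. This is a minimal illustration rather than our production code: the function name `age_match_bin`, the rounding of fCMASS × Nh to an integer, and the toy inputs are assumptions.

```python
import random


def age_match_bin(z_starve, f_cmass, mu_cmass, rng):
    """Assign CMASS flags to the subhalos of one stellar mass bin.

    Color ranks are drawn from N(mu_cmass, 1) for CMASS galaxies and
    from N(0, 1) for non-CMASS galaxies; the largest color ranks are
    given to the subhalos with the largest z_starve.
    """
    n_h = len(z_starve)
    n_cmass = round(f_cmass * n_h)  # assumed rounding convention
    draws = [(rng.gauss(mu_cmass, 1.0), True) for _ in range(n_cmass)]
    draws += [(rng.gauss(0.0, 1.0), False) for _ in range(n_h - n_cmass)]
    # Rank-order the joint set of color ranks, largest first.
    draws.sort(key=lambda draw: draw[0], reverse=True)
    # Rank-order subhalos by z_starve, largest first.
    halo_order = sorted(range(n_h), key=lambda i: z_starve[i], reverse=True)
    is_cmass = [False] * n_h
    for (_, flag), halo_index in zip(draws, halo_order):
        is_cmass[halo_index] = flag
    return is_cmass


# Toy bin: 20 subhalos with increasing z_starve, half of them CMASS.
toy_z_starve = [0.1 * i for i in range(20)]
extreme = age_match_bin(toy_z_starve, 0.5, 10.0, random.Random(0))
stochastic = age_match_bin(toy_z_starve, 0.5, 0.0, random.Random(0))
```

With a large mean offset (here 10) the CMASS flags land on the subhalos with the highest zstarve; with zero offset the flags are uncorrelated with zstarve, while the number of CMASS galaxies in the bin is the same in both cases.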

The value of μCMASS determines the strength of the
correlation between the CMASS selection function and subhalo
zstarve at fixed stellar mass. If μCMASS is large (for instance,
μCMASS = 10), then λcol values with a CMASS-designation
are always larger than λcol values attached to non-CMASS
draws, in which case at fixed stellar mass, subhalos with
the highest zstarve are always selected to be CMASS galaxies.
On the other hand, if μCMASS = 0, the λcol distributions
of CMASS and non-CMASS draws are identical, so in
this case matching the λcol and zstarve distributions has no
impact on the CMASS designation assigned to the subhalos,
and the CMASS selection function is uncorrelated with
zstarve at fixed stellar mass. Finally, for intermediate values
of μCMASS (for instance, μCMASS = 0.6), CMASS and
non-CMASS galaxies have overlapping λcol distributions,
but CMASS galaxies are “redder” on average. Figure 2
illustrates these concepts.

In our analysis, μCMASS is left as a free parameter, which
means that we determine the degree to which CMASS colors
overlap with non-CMASS galaxies directly from the data.
We do not, however, currently account for any redshift or
stellar-mass dependence of μCMASS, and thus we do not account
for any redshift or stellar-mass dependence of the CMASS
color cuts. This is a limitation of our current model, the
importance of which will become clearer in Section 6.3.

5.4 Accounting for the Stellar Mass Completeness 
of CMASS as a Function of Redshift 

We assume a single global SMF over the CMASS redshift
range. For each set of parameters (φ1, M0, σ), we create a
mock catalog via abundance matching. At this point galaxies
in the mock catalog have redshifts and stellar masses.
The next step is to tag CMASS galaxies in the
abundance-matched mock catalog as a function of redshift. We divide
our simulation into seven redshift bins along the Z direction
(the bin width is Δz = 0.04). The redshift width of
Δz = 0.04 is conservative and this choice is motivated by
the uncertainty of photometric redshift estimation in the
S82-MGC catalog.

The CMASS SMF varies as a function of redshift as a
result of the BOSS selection function. In Section 3.3, we
used the S82-MGC catalog to measure the abundance
of CMASS galaxies as a function of mass and redshift,
N_CMASS^S82(M*, z). Because we assume that the total SMF
does not vary over our redshift baseline, we can compute how
many CMASS galaxies are expected for every redshift slice
in the mock catalog simply by scaling this number by the
ratio of the volume in the redshift slice in the mock (ΔVsim(z))
to the S82-MGC volume (ΔVS82(z)):

$$ N^{\rm mock}_{\rm CMASS}(M_*, z) = \frac{\Delta V_{\rm sim}(z)}{\Delta V_{\rm S82}(z)}\, N^{\rm S82}_{\rm CMASS}(M_*, z). \tag{9} $$

In order to predict the number of mock galaxies as a function
of mass and redshift, we construct bins in stellar mass from
10.6 to 12.3 dex with Δ log M* = 0.05. We have checked that
our prediction is stable with Δ log M* = 0.1. In the mock
catalog, we randomly tag N_CMASS^mock(M*, z) galaxies with a
CMASS flag. For a small number of bins, N_CMASS^mock(M*, z)
exceeds the number predicted by the total SMF (simply due
to sample variance). In this case, we simply set
N_CMASS^mock(M*, z) equal to the total number of mock galaxies
in that bin.

Following this procedure, every galaxy in our mock 
catalog is now assigned a stellar mass, a redshift, and a 
flag that indicates mock CMASS galaxies. By design, mock 
CMASS galaxies have stellar mass distributions that match 
the ones measured in Section 3.3. 
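
The tagging step for one (M*, z) bin can be sketched as below. The function names, the rounding to an integer count, and the toy numbers are assumptions made for illustration; the volume scaling and the cap at the total number of mock galaxies follow the description above.

```python
import random


def n_cmass_mock(n_cmass_s82, vol_sim, vol_s82, n_total_mock):
    """Expected CMASS count in one (M*, z) bin of the mock.

    Scales the S82-MGC count by the volume ratio and caps the result
    at the total number of mock galaxies available in the bin.
    """
    expected = round(n_cmass_s82 * vol_sim / vol_s82)  # assumed rounding
    return min(expected, n_total_mock)


def tag_cmass(n_total_mock, n_tag, rng):
    """Randomly flag n_tag of the n_total_mock galaxies in a bin as CMASS."""
    chosen = set(rng.sample(range(n_total_mock), n_tag))
    return [i in chosen for i in range(n_total_mock)]


# Toy bin: the mock slice has twice the S82-MGC volume and 100 galaxies.
n_tag = n_cmass_mock(n_cmass_s82=40, vol_sim=2.0, vol_s82=1.0, n_total_mock=100)
flags = tag_cmass(100, n_tag, random.Random(1))
```

Because the flags are assigned at random within each bin, this corresponds to the stochastic case in which CMASS galaxies are a random sub-sample at fixed stellar mass and redshift.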

5.5 Predicting the CMASS Two-Point 
Correlation Function 

We now have a mock catalog that contains galaxies with
three-dimensional positions and with a flag that indicates
CMASS galaxies. The next step is to compute the predicted
CMASS two-point correlation function. Wp,theory is computed
from the mock following the exact same procedure as
for the BOSS DR10 data. To account for the finite volume of
the simulation, we compute a covariance matrix for Wp,theory
(referred to as CWp,theory), which is estimated via jack-knife
by dividing the (X, Y)-plane into 256 equal regions. For the
small scales of concern in this paper, jack-knife errors
outperform bootstrap errors (P. Norberg, private communication;
Arnalte-Mur & Norberg et al., in prep).
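
For reference, a generic delete-one jack-knife covariance estimate can be sketched as follows. It uses the standard estimator with the (N − 1)/N prefactor, which we assume here for illustration; the function name and the toy inputs are hypothetical, and the sketch does not reproduce the details of our measurement pipeline.

```python
def jackknife_covariance(samples):
    """Delete-one jack-knife covariance matrix.

    samples[k][i] is the statistic in bin i measured with region k
    removed. Uses C_ij = (N - 1)/N * sum_k (x_ki - mean_i)(x_kj - mean_j).
    """
    n = len(samples)
    n_bins = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(n_bins)]
    factor = (n - 1) / n
    return [
        [
            factor * sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples)
            for j in range(n_bins)
        ]
        for i in range(n_bins)
    ]


# Toy example: three jack-knife realizations of a two-bin statistic.
toy_samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = jackknife_covariance(toy_samples)
```

In the setting described above there would be 256 realizations, one per removed region of the (X, Y)-plane, and one entry per projected separation bin.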

The fitting for Wp is performed with

$$ \chi^2_{W_p} = \sum_{i,j} \Delta_{W_p}(r_{p,i}; \phi_1, M_0, \sigma)\, C^{-1}_{W_p,{\rm total},ij}\, \Delta_{W_p}(r_{p,j}; \phi_1, M_0, \sigma), \tag{10} $$

where ΔWp(rp,i; φ1, M0, σ) = Wp,meas(rp,i) −
Wp,theory(rp,i; φ1, M0, σ), and the total covariance matrix
includes uncertainties in both the measurement and our
theory estimates, i.e., CWp,total = CWp,meas + CWp,theory.

5.6 Summary of Methodology 

Figure 3 presents an illustration of our methodology for the
AbM model. A summary of our methodology is as follows:

(i) Start with a set of SMF parameter values. For the stochastic
(“AbM”) model, the parameters are (φ1, M0, σ). For the
age-matching (“AgM”) model there is an additional parameter,
μCMASS. The parameter μCMASS only impacts the modeling
of two-point statistics such as Wp; one-point statistics
such as the SMF are entirely unaffected by μCMASS.

(ii) The two parameters φ1 and M0 control the total SMF
(without scatter). The total SMF including scatter is obtained
analytically following Equation (7). A χ²_SMF is computed
between this analytic model and the total stellar mass
function estimated in Section 3.2.

(iii) In parallel, we generate a mock catalog to model Wp. The
first step in generating this mock catalog is to abundance
match the mock catalog using the same total SMF (without
scatter) as in the previous step. We test abundance matching
both in terms of Vpeak and Mpeak. Scatter (σ) is introduced
into stellar mass in the mock catalog at fixed Vpeak
(or Mpeak). We have checked that the mock catalog is large
enough that stochasticity due to the rare number of high-mass
halos is a negligible effect, i.e., the Poisson error in the
measured mass function dominates the error budget at high
stellar masses.

(iv) The Z direction of the mock is taken as the redshift axis.
Mock CMASS galaxies are tagged in the mock catalog by
down-sampling the overall population in redshift and stellar
mass in order to reproduce the CMASS SMFs measured in
Section 3.3.

(v) At this stage, mock CMASS galaxies are simply a random
sub-sample of the overall population; this mock corresponds
to our stochastic “AbM” model.

(vi) For the age-matching model, we begin by assigning the
subhalos a color-rank, λcol, as follows. For subhalos hosting
a non-CMASS galaxy, λcol is drawn from a Gaussian
distribution with zero mean and unit variance. For subhalos
hosting a CMASS galaxy, λcol is drawn from a Gaussian
distribution with mean μCMASS and unit variance. At fixed
stellar mass, the random λcol values are rank-ordered. At
the same stellar mass, the mock galaxies are rank-ordered
by a secondary halo property; we choose zstarve, defined
in §5.3, as this secondary parameter in our fiducial model.
In monotonic fashion, the subhalos with the largest zstarve
values are assigned the largest λcol values, and the
CMASS/non-CMASS designation associated with each λcol
value is also assigned to the subhalo.^
Thus subhalos in the “AbM” and “AgM” mocks in general
have different CMASS-designations: in “AgM”, the
CMASS-designation is correlated with zstarve at fixed Vpeak, with the
correlation strength governed by our μCMASS parameter.

(vii) Generate a random catalog that follows the CMASS dn/dz
and compute Wp,theory. Note that Ctheory is fixed using our
best-fitting parameters (after a first initial iteration).

(viii) Compute χ²_wp between Wp,meas and Wp,theory, and then add
them as χ² = χ²_SMF + χ²_wp.

(ix) Iterate this procedure.

Figure 2. Illustrative figure of the color-rank distributions for
CMASS and non-CMASS galaxies. The λcol “colors” of non-CMASS
galaxies are drawn from a normal distribution with unit
variance and zero mean (shown by the solid blue line). The λcol
“colors” of CMASS galaxies are drawn from a normal distribution
with unit variance and with a mean value equal to μCMASS. When
μCMASS = 0.599 (dashed red line), CMASS and non-CMASS
galaxies have overlapping color distributions but CMASS is redder
on average. When μCMASS = 10 (solid red line), all CMASS
galaxies are redder than non-CMASS galaxies (this situation
corresponds to the extreme age-matching case explored in Section
6.3). Our best-fitting value for μCMASS is 0.599 and corresponds
to the distribution shown by the dashed red line.

Figure 3. Illustration of our overall methodology for constraining the AbM model and creating a mock CMASS catalog. The stochastic
AbM model contains three free parameters: (φ1, M0, σ). The age-matching (AgM) model contains one additional parameter, μCMASS,
which controls how strongly CMASS galaxies correlate with zstarve at fixed Vpeak.

The best-fit parameters and errors are determined using
the Markov chain Monte Carlo (MCMC) technique. We
use a modified version of COSMOMC (Lewis & Bridle 2002)
that has been well tested in previous work (Saito et al. 2011;
Zhao et al. 2013; Saito et al. 2014). Since our SMF is estimated
from the S82-MGC and the correlation function Wp is
computed over the full DR10 footprint, the cross correlation
between these two statistics is negligible. The χ² for the
multipoles is defined in a similar way. The correlation between
the monopole and the quadrupole is properly taken
into account by the covariance matrix.

6 RESULTS 

6.1 Abundance Matching 

We now perform a joint fit to the SMF and to Wp.
The left panel of Figure 4 presents our best fit to the
SMF using a double Schechter function and abundance
matching against Vpeak. The best-fit parameters for the
double Schechter function are (φ1, log10 M0, σ) =
(1.86 × 10^-3, 10.89, 0.105) with χ²_SMF = 4.55. Errors are
reported with a 68% confidence level. We find excellent fits
to both the SMF and Wp with two specific points worth
highlighting. First, the amplitude of our best-fit SMF agrees well
with COSMOS and PRIMUS at log10 M* > 11.0 but has a
lower amplitude at log10 M* < 11.0. Because the number
density of CMASS drops sharply below this mass scale, we
simply do not expect to constrain the total SMF in this
range. Second, the best-fit value for the scatter is lower
than our naive expectation. Indeed, σ should include
contributions from measurement errors as well as from intrinsic
scatter in the stellar-to-halo mass relation. The average
uncertainty in stellar mass measurements from the S82-MGC
is of order σmeas ~ 0.1 dex in this mass and redshift range.
Hence, a value of σ = 0.105 implies a very small intrinsic
scatter in the stellar-to-halo mass relation. We will return
to this point in the discussion section.

^ For certain tests, we may rank-order the subhalos only according
to zform or zchar (Section 5.3).

The right-hand panel of Figure 4 presents our best fit
to Wp as the red line (χ²_wp = 11.43). The goodness of fit in
this case is χ²/(d.o.f.) = (4.55 + 11.77)/(8 + 18 − 3) = 0.710.
We have also tested abundance matching against Mpeak instead
of Vpeak. The blue line shows the results of abundance
matching against Mpeak using the same best-fitting SMF
parameters as above. As can be seen from Figure 4, Vpeak
yields a larger clustering amplitude and is more consistent
with the BOSS data than Mpeak.
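
The goodness-of-fit arithmetic quoted above can be checked with a few lines. The helper name `reduced_chi2` is hypothetical; the inputs are the numbers appearing in the goodness-of-fit expression (8 SMF data points, 18 Wp data points, and 3 free parameters).

```python
def reduced_chi2(chi2_terms, n_data_points, n_free_params):
    """Total chi-square divided by the number of degrees of freedom."""
    dof = sum(n_data_points) - n_free_params
    return sum(chi2_terms) / dof


# AbM model: chi-square contributions from the SMF and from Wp.
abm = reduced_chi2([4.55, 11.77], [8, 18], 3)
```

Here the number of degrees of freedom is 8 + 18 − 3 = 23, and the result rounds to 0.710.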

There are two factors which lead to the differences in
these clustering predictions. First, host halos selected by
Mpeak cluster more weakly relative to host halos selected
by Vpeak, as discussed in Zentner et al. (2014). This effect
is small in our halo mass range, which we have verified
by comparing the central-central pair counts between the
two models, which are nearly identical. Second, the satellite
fractions predicted by the two models are different: our
Vpeak-based SHAM model has a larger satellite fraction relative
to the Mpeak-based model. Indeed, at fixed Mpeak,
subhalos have larger Vpeak than host halos (see Figure 2
in Reddick et al. 2013), which suggests that rank-ordering
with Vpeak results in similar clustering of central galaxies
but the larger satellite fraction boosts the overall clustering
amplitude. We adopt Vpeak as our fiducial model and do not
explore abundance matching with Mpeak any further.

6.2 CMASS dn/dz 

Figure 5 presents a comparison between the redshift distribution
of CMASS galaxies from our best-fitting mock catalog
and the redshift distribution of CMASS galaxies in the
S82-MGC and from the full BOSS DR10 SGC. Our mock
reproduces the CMASS dn/dz from the S82-MGC catalog and
is consistent with dn/dz from the BOSS DR10 SGC. The
amplitude differences between the dn/dz from our mock and
the DR10 dn/dz are due to sample variance. In our current
methodology, the sample variance introduced by matching
the CMASS SMFs from Stripe 82 is not taken into account,
which is a limitation of our current approach. This reflects a
trade-off made to take advantage of the higher quality stellar
mass estimates from the S82-MGC, but in doing so, our current
analysis is also limited by the sample variance from Stripe
82.


6.3 Gaining an Intuition for Age Matching Above 
Collapse Mass 

In the previous section, we showed that a reasonable fit to
the SMF and Wp can be achieved using a simple abundance
matching scheme in which galaxy color in high-mass halos is
simply a stochastic process. We now investigate whether or not
models in which galaxy color correlates with halo assembly
properties can achieve comparable results. One caution
worth mentioning here with respect to the age matching
model is that, unlike in H13, the combination of the steep
Vpeak-M* relation and the non-zero scatter in this relation
leads to a difference in the mean host halo mass compared
to the standard abundance matching model (see Appendix
B for details).

First, we wish to develop some intuition for how the
different components of zstarve affect Wp in this high halo
mass regime. Figure 6 shows that CMASS galaxies are firmly
in halo masses above the collapse mass, Mcol(z = 0.534).
Hence, the behavior of zstarve may be fundamentally
different compared to previous work by H13.
Let us begin by considering an extreme case in which
CMASS galaxies are all redder than non-CMASS galaxies
(λcol,CMASS ≫ λcol,others; see solid lines in Figure 2). For
this test, we adopt the values of the best fit to the SMF from
Figure 4. At fixed stellar mass, we rank order galaxies according
to zstarve, zform, or zchar. The results of this extreme
case are presented in Figure 7. Interestingly, but perhaps
not surprisingly, we find that zform (blue curve) lowers the
clustering amplitude. This is because zform is defined using
halo concentrations, and assembly bias has opposite
effects above and below collapse mass when using
the concentration parameter (see e.g. Wechsler et al. 2006;
Dalal et al. 2008). Thus, in this high-mass regime, zform
causes red galaxies to cluster less strongly than blue ones.


Let us now turn our attention to zchar. Interestingly,
rank ordering according to zchar produces the opposite effect
and causes an increase in the clustering amplitude. Previous
work on assembly bias has shown that the switch in the
assembly bias effect seen when considering halo concentration is
not always reflected when considering other halo parameters.
Previous work has not studied the specific case of zstarve;
however, Jing et al. (2007) and Li et al. (2008) report that
when an age parameter based on a fixed mass threshold such
as z1/2 is used, where z1/2 denotes the redshift when a halo
acquires half of its final mass at the observational time, a
similar behavior is observed (see Figure 4 in Li et al. 2008).

Finally, let us now examine the zstarve component, which
includes contributions from both zchar and zform. The
prediction for zstarve lies between the zform and the zchar cases
but is closer to zchar than to zform. This is because in this
mass regime, zstarve is dominated by zchar, not zform (see
Figure 8). Thus the impact of assembly bias for CMASS is
qualitatively distinct from the trends identified by H13 in
lower mass halos, a fact which traces to the change in
character of assembly bias for halos above and below collapse
mass.

The dashed curves in Figure 7 display Wp for central
galaxies only, demonstrating that the trends discussed
above are not simply due to varying satellite fractions.


6.4 Fit to Wp with an Age Matching Type Model 

Of course, the true differences between the color distributions
of CMASS galaxies compared to non-CMASS galaxies
of similar mass are not as extreme as the case explored in
the previous section. As discussed in Section 5.1, the
implementation of age matching first requires a characterization
of the color distributions of galaxies from the S82-MGC as
a function of mass and redshift and also requires modeling
the effects of scatter introduced in these color distributions
from photometric redshifts. This is an aspect that we defer
to Paper II. Here, we perform a qualitative investigation
of the effects of age matching on the two-point correlation
function using the color-rank variable μCMASS.

We now perform a joint fit to the SMF and to Wp
in which μCMASS is left as a free parameter (the “AgM”
model). The results are presented in Figure 9. The best-fit
parameters are (φ1, log10 M0, σ, μCMASS) = (2.51 × 10^-3,
10.83, 0.136, 0.599) with χ²_SMF = 4.09
and χ²_wp = 10.75. The AbM model and the AgM model yield
a comparable goodness of fit (χ²/(d.o.f.) = (4.55 + 11.77)/(26 −
3) = 0.710 for the AbM model and χ²/(d.o.f.) = (4.09 +
10.75)/(26 − 4) = 0.674 for the AgM model). There are three
points worth highlighting concerning the results of the AgM
model. First, the best-fit SMF has a slightly higher amplitude
at low stellar masses compared to the AbM model and
is in better agreement with PRIMUS and COSMOS. Second,
the best-fit value for the scatter is larger than in the AbM
model (σ = 0.136 versus σ = 0.105), which leaves a larger
margin for intrinsic scatter. Finally, the best-fit value for
μCMASS of 0.599 corresponds to a scenario in which CMASS
and non-CMASS galaxies have overlapping color distributions,
but with CMASS galaxies being somewhat redder on
average. Reassuringly, this result matches our qualitative
expectations for this sample.
The AgM model explored here is simplistic in the sense
that we have used a single value of μCMASS over the whole
CMASS redshift range, whereas the true color distribution
of CMASS versus other galaxies depends on redshift. A more
sophisticated model which accounts for this effect will be
presented in a forthcoming paper.

7 DISCUSSION 

7.1 HOD Modeling in the Context of Complex 
Samples such as CMASS 

Both HOD and SHAM are popular methods for modeling
the SMF and the galaxy two-point correlation functions.
One reason that HOD methods are popular is that they provide
a relatively simple framework that can also be used to
rapidly model a variety of observables. However, one of the
downsides of this method is that specific functional forms
must be assumed for the central and satellite occupation
functions. These assumptions may be robust for
volume-limited threshold samples such as those commonly studied
in the SDSS main samples (e.g., Zehavi et al. 2011). However,
it is less clear if these types of methods can be applied
to samples such as CMASS which are selected via complex
color and luminosity cuts and for which both the shape and
normalization of the effective HOD may vary with redshift.

There have been several attempts to model the CMASS-halo
connection on the basis of HOD type models. Among
these studies, Guo et al. (2013, 2014) and More et al.
(2014) focused on specific sub-samples of CMASS, whereas
White et al. (2011) and R14 used an HOD type model to
describe the clustering of the full CMASS, assuming no redshift
evolution in the HOD.

In this paper, we have introduced a novel SHAM-based
method^ that can be used to model complex populations
such as CMASS by accounting for the mass completeness
of the sample as a function of redshift. We explore a first
qualitative approach for also considering color completeness
which will be developed further in Paper II. We now investigate
what these models predict in terms of the redshift
dependence of the CMASS HOD. The right panel of Figure
1 shows that the SMF of CMASS varies strongly with redshift.
This figure alone suggests that the HOD of the CMASS
sample is not likely to be uniform over the CMASS redshift
range.

Figure 10 presents the HODs predicted from our AbM 
and AgM mock catalogs as a function of redshift. As a 
comparison we also display the HOD from R14, which as¬ 
sumes no redshift evolution. R14 fit the clustering assum¬ 
ing a constant number density with a derived value of 
$n = (4.12 \pm 0.13) \times 10^{-4}\,(h^{-1}\,{\rm Mpc})^{-3}$ (see Figure 5) under 
the assumption that the CMASS dn/dz can be obtained by 
simply down-sampling the best-fit HOD as a function of red¬ 
shift. We down-sample the R14 HOD to match the CMASS 
dn/dz and present the results in Figure 10. 
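
As an illustration of the down-sampling step, a minimal sketch is given here. It assumes that random down-sampling multiplies the mean occupation at every halo mass by the same fraction, n_target/n_HOD. The function name, the example occupation values, and the 10% target density are hypothetical and are not taken from R14 or from our catalogs; only the R14 number density is quoted from the text.

```python
# Illustrative sketch (not the R14 or the authors' code): randomly
# down-sampling a redshift-independent HOD to a target number density.

N_HOD = 4.12e-4  # R14 number density in (h^-1 Mpc)^-3, quoted in the text


def downsample_hod(mean_occupation, n_target, n_hod=N_HOD):
    """Scale every <N(M)> value by the constant fraction n_target / n_hod.

    Random down-sampling keeps each galaxy with the same probability,
    so the shape of the HOD is unchanged and only its amplitude drops.
    """
    fraction = n_target / n_hod
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("down-sampling cannot add galaxies")
    return [fraction * n for n in mean_occupation]


# Hypothetical input: a central occupation that saturates at unity and a
# target density equal to 10% of the HOD density.
occupation = [0.0, 0.5, 1.0, 1.0]
scaled = downsample_hod(occupation, n_target=0.1 * N_HOD)
```

In this toy case the central occupation saturates at 0.1 instead of unity, because a uniform random down-sampling changes only the amplitude of the HOD and not its shape.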

There are several noteworthy differences between our 
HODs and the single non-evolving one from R14. At the 

After this paper was submitted, a parallel effort that adopts a 
methodology similar to ours was brought to our attention 
(Rodríguez-Torres et al. 2015). 


lowest redshift bin, z = 0.445, the mean occupation for central 
galaxies does not approach unity because the SMF is incomplete 
at the high-mass end (see the magenta curve in the 
right panel of Figure 1). At z = 0.565, which corresponds to 
the peak of the CMASS dn/dz, our HOD is more similar to 
R14, but there is still a discrepancy in the shape of $\langle N_{\rm cen}\rangle$, 
especially at the low-mass end. The largest differences are at 
z > 0.6. Our HODs converge to unity at large halo masses 
whereas the down-sampled R14 one converges to $N_{\rm tot} \sim 0.1$; 
this is due to the stellar mass completeness of CMASS. This 
difference arises because, in our models, the decline of the 
CMASS number density above z = 0.55 is caused by the 
increase in the mean stellar mass of the sample (as constrained 
by data from the S82-MGC). In contrast, the fixed 
HOD of R14 must significantly down-sample the overall amplitude 
of the HOD to achieve comparable number densities. 

Finally, our model predicts an evolution of the mean 
halo mass of CMASS as a function of redshift. More specifically, 
our models predict that, at z = 0.445, 0.565, and 
0.685, the mean halo mass of central CMASS galaxies is 
$\log_{10}(M_{\rm halo}\,[h^{-1}M_\odot]) = 13.12$ (13.15), 13.34 (13.35), and 
13.66 (13.68) for the abundance-matched (age-matched) 
cases, respectively. This variation is driven by the fact that the 
mean stellar mass of the sample varies with redshift, as is 
clearly seen in the right panel of Figure 1. These values can be 
compared with the HOD result, $\log_{10}(M_{\rm halo}\,[h^{-1}M_\odot]) = 13.51$, 
which is higher (lower) than our results at low (high) 
redshift. In addition, our models predict that the CMASS 
satellite fraction varies with redshift from 12% to 9%, as 
shown in Figure 11. While this might seem like a 
small and negligible variation, the fiducial HOD from R14 
constrains the satellite fraction at 6.8% precision. It is interesting 
that the value inferred from the single HOD fit in 
R14 is consistent with our values at z ~ 0.6 but not at 
lower redshifts. 
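
The size of this variation can be checked with simple arithmetic on the values quoted above. The short sketch below converts the differences in log10 mean halo mass between z = 0.445 and z = 0.685 into linear ratios. The variable and function names are chosen here for illustration, and only the quoted logarithmic masses are used as input.

```python
# Worked arithmetic using only the log10 halo masses quoted in the text.
LOG_MHALO = {  # redshift -> (AbM, AgM) values of log10(M_halo [h^-1 Msun])
    0.445: (13.12, 13.15),
    0.565: (13.34, 13.35),
    0.685: (13.66, 13.68),
}


def mass_ratio(log_high, log_low):
    """Linear ratio of two masses given as log10 values."""
    return 10.0 ** (log_high - log_low)


ratio_abm = mass_ratio(LOG_MHALO[0.685][0], LOG_MHALO[0.445][0])
ratio_agm = mass_ratio(LOG_MHALO[0.685][1], LOG_MHALO[0.445][1])
print(f"AbM: {ratio_abm:.2f}, AgM: {ratio_agm:.2f}")
```

Both ratios are close to the factor of 3.5 quoted in the caption of Figure 6.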

In conclusion, our work suggests that CMASS is a com¬ 
plex sample for which the HODs are likely to vary with red¬ 
shift in a non-trivial manner. A single HOD fit to the overall 
$w_p$ broadly agrees with the predictions from our model at 
the median redshift of the sample. However, at lower and 
higher redshifts, our model predicts that HODs are not sim¬ 
ple down-sampled versions of the HOD at the peak of the 
dn/dz. 

7.2 A Cautionary Tale of Modeling Small Scale 
Statistics 

Many previous studies have used a combination of galaxy 
abundances and the projected galaxy two-point correla¬ 
tion function in order to constrain the galaxy-halo con¬ 
nection (e.g., Leauthaud et al. 2011; Coupon et al. 2015; 
Zu & Mandelbaum 2015). However, the fact that SHAM or 
HOD models can reproduce these observables does not necessarily 
imply that the models accurately capture the true 
underlying galaxy-halo connection: a good fit to the data 
does not guarantee that the model is correct. A clear illustration of 
this statement in the context of mock galaxy samples with 
strong assembly bias is discussed in Zentner et al. (2014). In 
this paper, we have studied two distinct models: standard 
abundance matching and a simplified form of age match¬ 
ing, abbreviated by AbM and AgM, respectively. We have 
Figure 4. (Left): best fit to the S82-MGC SMF for the AbM model (solid black line). The dotted black line corresponds to the SMF 
deconvolved for scatter. The black dashed curve shows the (fixed) $\phi_2$ term in our double Schechter function. Black squares correspond 
to the measured SMF from the S82-MGC. (Right): our best fit to $w_p$ for the AbM model (solid red line). The green line shows the 
result of abundance matching against $M_{\rm peak}$ instead of $V_{\rm peak}$. Dashed lines display the contribution to $w_p$ from central-central pairs. 
Numbers in parentheses indicate satellite fractions (11.1% for $V_{\rm peak}$ and 9.5% for $M_{\rm peak}$). The goodness of fit for the AbM model is 
$\chi^2/{\rm dof} = (4.55 + 11.77)/(26 - 3) = 0.710$. 
Figure 5. Comparison between the CMASS dn/dz from our fiducial 
mock catalog (red histograms), the measured dn/dz from 
the S82-MGC (blue histograms), and the measured dn/dz from 
the full BOSS DR10 SGC (white histograms; Anderson et al. 
2014). Errors on the dn/dz for the S82-MGC are estimated via 
bootstrap. For the DR10 SGC dn/dz, redshift failures and 
fiber-collided galaxies are included using a nearest-neighbor weighting 
scheme (see Anderson et al. 2014). By construction, our models 
reproduce the redshift distribution of CMASS galaxies from 
the S82-MGC catalog, which is in turn consistent with the DR10 
SGC CMASS redshift distribution. The number density from the 
fiducial R14 model is shown as a horizontal solid black line. In 
the R14 model, the CMASS dn/dz is reproduced by randomly 
down-sampling a fixed redshift-independent HOD. 


demonstrated that both models can reproduce the galaxy 
SMF as well as $w_p$, suggesting that there are fundamental 
degeneracies among the traditional HOD, AbM, and 
AgM models in modeling the SMF and $w_p$. This naturally 
leads to two interesting and inter-related questions. 


Figure 6. Halo mass histograms as a function of redshift from our 
AbM (solid lines) and AgM (dashed lines) mock catalogs. Collapse 
mass at z = 0.534 is indicated by a black solid vertical line. 
Clearly, CMASS galaxies populate halos with masses firmly above 
collapse mass. Also note that the mean halo mass of CMASS in 
our mocks varies by a factor of 3.5 from low to high redshift. 

(i) How well do these models predict other statistics derived 
from the data? 

(ii) Are there other statistics which can distinguish between 
these two distinct models? 

Instead of considering just the projected correlation 
function, we turn our attention to the multipoles of the full 
2D correlation function. Figure 12 shows the pseudo mul¬ 
tipoles (see Section 3) for our best-fitting AbM and AgM 
models. The left panel of Figure 12 demonstrates that both 
models fail dramatically to reproduce the pseudo multipoles 
even though both models provide a satisfactory description 
of $w_p$. In the following section, we will use the redshift-dependent 
clustering of CMASS to argue that, in addition to 
stellar mass, galaxy color must play an important role in 
Figure 7. Impact of age matching (AgM model) on $w_p$ for an 
extreme scenario with /iCMASS = 10 (CMASS galaxies are redder 
than all other galaxies). Rank ordering is performed versus $z_{\rm form}$ 
(blue), $z_{\rm starve}$ (green), and $z_{\rm char}$ (cyan). For comparison, we also 
present the best-fit curves from the AbM model (red solid line) in 
which the correlation between the colors of CMASS galaxies and 
subhalo age is completely stochastic. The goal of this figure is 
simply to highlight the qualitative trends of age matching above 
collapse mass. Rank ordering versus $z_{\rm char}$ increases the amplitude 
of $w_p$ whereas rank ordering versus $z_{\rm form}$ decreases the amplitude 
of $w_p$. Dashed lines show the contribution to $w_p$ from 
central-central pairs. Numbers in parentheses indicate satellite fractions. 
Figure 8. Fractional contribution to $z_{\rm starve}$ as a function of $V_{\rm peak}$ 
at z = 0.534 for host (square) and sub (circle) halos. The $z_{\rm char}$ 
term dominates at the high-mass end whereas the $z_{\rm form}$ term 
dominates at the low-mass end. 


determining the clustering of CMASS galaxies and that the 
failure of our model in reproducing the pseudo-multipoles 
must be a consequence of these effects. 

In conclusion, our paper provides a clear cautionary ex¬ 
ample of the limitation of inferring the galaxy-halo connec¬ 
tion from the projected correlation function alone. It is also 
clear from Figure 12 that the pseudo-multipoles contain additional 
information not captured by $w_p$ and that these may 
represent a powerful and under-utilized tool to provide additional 
constraints on the galaxy-halo connection. These 
aspects will be explored in greater detail in a forthcoming 
paper. 


7.3 Redshift Evolution of CMASS Clustering 

As discussed in Section 7.1, one major difference between the 
R14 model and this work is the treatment of the redshift evolution 
of the CMASS sample. In R14, CMASS is assumed to 
be a single homogeneous sample with a dn/dz that is modeled 
by down-sampling a redshift-independent HOD. In contrast, 
in this paper, the varying number density of CMASS is a 
direct result of the measured mass incompleteness of the 
sample as constrained by the S82-MGC catalog. We now explore 
the consequences of these differences by examining the 
redshift-dependent clustering of CMASS. 

The original motivation for the non-evolving HOD in 
R14 originates from the observation that the clustering of 
CMASS galaxies does not vary strongly with redshift. This 
is shown by Figure A1 in R14 (reproduced here in the right 
two panels of Figure 12). Because randomly down-sampling 
galaxies does not modify their clustering, the R14 model 
leads to a constant clustering amplitude with redshift, which 
indeed seems well supported by Figure 12. However, another 
consequence of this procedure is that the halo mass of the 
CMASS sample is constant with redshift in the R14 model. 
In contrast, the S82-MGC catalog shows that the stellar 
mass of the CMASS sample increases by a factor of 1.8 over 
the range 0.43 < z < 0.7, which leads to a factor of 3.5 
increase in the predicted mean halo mass of CMASS based 
on our SHAM modeling. 
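
The statement that random down-sampling does not modify clustering can be illustrated with a toy experiment. The sketch below is not part of our analysis pipeline: it builds a hypothetical clustered point set in two dimensions (all parameters are arbitrary), removes a random half of the points, and compares a simple estimate of 1 + ξ at small separation, ignoring edge effects.

```python
import math
import random


def make_clustered_points(n_clusters, n_per_cluster, sigma, rng):
    """Toy clustered catalog: Gaussian clumps scattered in the unit square."""
    points = []
    for _ in range(n_clusters):
        cx, cy = rng.random(), rng.random()
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma)))
    return points


def one_plus_xi(points, r_max):
    """Pairs closer than r_max, relative to an unclustered sample.

    Edge effects are ignored, which is adequate for r_max << 1.
    """
    n = len(points)
    close_pairs = 0
    for i in range(n):
        x_i, y_i = points[i]
        for j in range(i + 1, n):
            dx = x_i - points[j][0]
            dy = y_i - points[j][1]
            if dx * dx + dy * dy < r_max * r_max:
                close_pairs += 1
    # Expected pair count for a uniform sample of unit surface density area.
    expected_random = 0.5 * n * (n - 1) * math.pi * r_max ** 2
    return close_pairs / expected_random


rng = random.Random(42)
full = make_clustered_points(60, 20, 0.01, rng)
half = rng.sample(full, len(full) // 2)  # random 50% down-sampling

amp_full = one_plus_xi(full, 0.02)
amp_half = one_plus_xi(half, 0.02)
print(f"1 + xi: full = {amp_full:.2f}, down-sampled = {amp_half:.2f}")
```

The two amplitudes agree to within the sampling noise, because both the measured pair counts and the expected random pair counts scale with the square of the sampling fraction.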

How much redshift evolution should we expect in the 
clustering of CMASS galaxies given this factor of 1.8 increase 
in stellar mass? The right-hand side of Figure 12 presents the 
predicted redshift evolution of the pseudo-multipoles from 
our SHAM modeling. We find that the observed stellar mass 
variation of the CMASS sample should lead to more than a 
factor of 1.5 increase in the clustering amplitude over the 
CMASS redshift range. Figure 12 clearly reveals a fundamental 
contrast between the measured non-evolution of 
the clustering of CMASS and the expectation based on the 
redshift-dependent stellar mass distributions. This discrepancy 
is qualitatively insensitive to the exact details of our 
SHAM methodology. The observed increase in stellar mass 
will lead to a roughly similar increase in halo mass (and hence 
clustering amplitude) independently of the exact halo parameter 
($V_{\rm peak}$ or $M_{\rm peak}$) used in the abundance matching. 

We argue that the discrepancy revealed in Figure 12 
suggests that, in addition to stellar mass, galaxy color must also 
play an important role in determining the clustering amplitude 
of CMASS galaxies. Figure 5 in L15 demonstrates 
that CMASS galaxies display a range in star-formation histories 
at fixed stellar mass. At low redshift and fixed stellar 
mass, the CMASS selection function excludes galaxies that 
have experienced recent star formation. At higher redshifts 
(z > 0.6), the CMASS sample is mainly flux limited and includes 
a larger range of galaxy colors at fixed magnitude. A 
variety of lensing and clustering studies suggest that, for low 


Note that we ignore here the redshift evolution in the MDR1 
simulation. However, as seen in Figure A1, the effect of the 
redshift evolution is at the level of 5%. 
Figure 9. (Left): our best fit to the S82-MGC SMF for the AgM model (solid cyan line). The dotted cyan line corresponds to the SMF 
deconvolved for scatter. For comparison, the AbM result is displayed with red lines. Black squares correspond to the measured SMF 
from the S82-MGC. (Right): our best fit to $w_p$ for the AgM model (solid cyan line). For comparison, the AbM result is shown as a red 
solid line. Dashed lines display the contribution to $w_p$ from central-central pairs. Numbers in parentheses indicate satellite fractions. The 
goodness of fit for the AgM model is $\chi^2/{\rm dof} = (4.09 + 10.75)/(26 - 4) = 0.674$. 


mass galaxies, the clustering of blue galaxies is weaker than that of 
red galaxies at fixed stellar mass (e.g., Tinker et al. 2013). 
It is not trivial that these trends persist in this very high 
galaxy mass regime, but if so, the inclusion of bluer galaxies 
in CMASS at higher redshifts may exactly compensate for 
the increase in the mean stellar mass. In other words, the 
observed constant clustering of CMASS may be due to a co¬ 
incidental compensation between color and stellar mass with 
redshift. 

7.4 What Determines Color in the Most Massive 
Galaxies? 

One of the main goals of this paper is to understand the 
connection between halo properties and the colors of very 
massive galaxies. As shown in Figure 6, CMASS galaxies 
live in halos with halo masses above 10^^ Mq. In this regime, 
gas accretion is thought to be dominated by the “hot halo 
mode” and heated by pressure-supported shocks to a temper¬ 
ature that limits star-formation (Dekel & Birnboim 2006). 
In addition, at these halo masses, “maintenance mode” feedback 
mechanisms, such as radio-mode feedback, are thought 
to further limit star-formation in the most massive galaxies 
(Croton et al. 2006). However, massive galaxies at these redshifts 
are observationally not all red and dead. The CMASS 
sample in fact contains a blue population (Guo et al. 2013; 
Ross et al. 2013). Based on high-resolution Hubble Space 
Telescope imaging, Masters et al. (2011) estimate that ~25% 
of the CMASS sample has a late-type morphology associated 
with a star-forming disk. 
Using a maximum likelihood approach that accounts for 
photometric errors as well as the CMASS selection cuts, 
Montero-Dorta et al. (2014) estimate that 37% of CMASS 
objects may intrinsically belong to the blue cloud. 

Semi-analytic models (SAMs) sometimes assume that 
galaxy color in high mass halos is a stochastic process. For 
example, Lu et al. (2014) adopt a simplified halo quenching 
model to mimic the effects of AGN feedback that stops 


radiative cooling in high-mass halos. In this model, radiative 
cooling is randomly switched off when halos reach a critical 
mass of 10*^^ Mq (with a Gaussian spread of ~ 0.3 dex). In 
Benson (2012), the GALACTICUS model is more sophisticated 
and follows the growth and spins of black holes. The AGN jet 
power is computed from the accretion rates and spins of the 
black hole and is used to counterbalance radiative cooling 
in the hot halo. The parameters of the GALACTICUS model 
are tuned to produce a transition around few 10*^^ Mq in 
halo mass, such that quenching begins above that mass. In 
this sense, quenching will be stochastic at Mhaio ^ 10^^ Mq 
but also depends on the black hole accretion rate and spin. 
In the GALACTICUS model, feedback may also shut down 
temporarily, for example after a merging event with high 
accretion rates which causes the black hole accretion disk to 
transition to a thin (radiative) mode with weaker jet power. 

It is thus interesting to ask what drives color in mas¬ 
sive galaxies which live in halos above 10^^ Mq. Is color a 
stochastic process that is simply linked to small episodic 
amounts of gas cooling and/or merging events? Or is the 
color in massive galaxies linked to halo properties such as 
halo age and hence perhaps more fundamentally tied to the 
large scale reservoir of fuel and the assembly history? 

Our current paper does not fully account for the color 
selection of CMASS but we can address some of the questions 
above. In our AbM model, galaxy color is randomly 
assigned at fixed stellar mass. In the AgM model, on the 
other hand, color is correlated with $z_{\rm starve}$ and hence with 
subhalo age. The degree to which color correlates with $z_{\rm starve}$ 
is left as a free parameter and determined from the data. We 
have shown that both models can reproduce the galaxy SMF 
as well as $w_p$ but fail to match the pseudo-multipoles. 
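
The difference between the two color assignments can be sketched schematically. The example below is a simplified illustration and not our actual implementation: it assumes a single stellar mass bin, takes larger color values to be redder, assigns redder colors to subhalos with larger z_starve, and uses Gaussian noise added before rank ordering as a stand-in for the free parameter that controls the strength of the correlation. The catalog values and function names are hypothetical.

```python
import random


def assign_colors(z_starve, colors, scatter, rng):
    """Assign colors to subhalos at fixed stellar mass by rank ordering.

    scatter = 0 gives strict rank ordering (reddest color to the largest
    z_starve); a very large scatter approaches a random assignment.
    """
    noisy_age = [z + rng.gauss(0.0, scatter) for z in z_starve]
    halo_order = sorted(range(len(z_starve)), key=lambda i: noisy_age[i])
    sorted_colors = sorted(colors)  # bluest first, reddest last (assumed)
    assigned = [None] * len(z_starve)
    for rank, halo_index in enumerate(halo_order):
        assigned[halo_index] = sorted_colors[rank]
    return assigned


rng = random.Random(0)
z_starve = [rng.uniform(0.6, 3.0) for _ in range(1000)]  # hypothetical values
colors = [rng.gauss(1.0, 0.2) for _ in range(1000)]      # hypothetical values

# AgM-like case: color follows z_starve (here with no scatter at all).
agm_like = assign_colors(z_starve, colors, scatter=0.0, rng=rng)

# AbM-like case: the same colors, assigned without regard to z_starve.
abm_like = colors[:]
rng.shuffle(abm_like)
```

Both assignments preserve the color distribution at fixed stellar mass by construction; they differ only in whether color carries information about subhalo age.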

To begin, let us focus on the consequences of Figure 12 
in terms of the AbM model. Because the mean stellar mass of 
the CMASS sample increases with redshift, the AbM model 
predicts a strong variation in the clustering amplitude with 
redshift which is clearly ruled out by the data. Hence, we 
argue that the stochastic color model (i.e., the AbM model) 
can be ruled out with high significance by our analysis. 

We now turn our attention to the AgM model. Our 
current implementation of the AgM model provides an excellent 
description of the SMF and $w_p$ but fails to reproduce 
the pseudo-multipoles. However, unlike in the case of the 
AbM model, in which the redshift dependence of the color 
cuts is unimportant, we know that our AgM model will be 
sensitive to these effects, which we have treated in a simplistic 
fashion. In a forthcoming paper, we will investigate whether a 
more realistic AgM model which accounts for the color 
completeness of CMASS with redshift can describe the 
redshift-dependent multipoles. This approach should provide 
powerful constraints on the physical mechanism that drives 
galaxy color in massive halos. 


8 SUMMARY AND CONCLUSIONS 

The last decade has seen rapid observational progress in our 
understanding of the relationship between galaxies and their 
underlying dark matter halos. However, the connection between 
galaxies and dark matter remains poorly constrained 
for massive galaxies with $\log_{10} M_* > 11.5$ because these 
galaxies are rare, with low number densities, and require 
large-area surveys to obtain statistically significant samples. 
The BOSS survey provides a spectroscopic data set 
of massive galaxies at intermediate redshifts with number 
densities of $\bar{n} \approx 3 \times 10^{-4}\,[(h^{-1}\,{\rm Mpc})^{-3}]$ in a survey volume 
that covers several cubic gigaparsecs (the “CMASS” sample). 
This gigantic data set enables high signal-to-noise ratio 
measurements of the three-dimensional galaxy clustering of massive 
galaxies. 

In this paper, we introduce a novel method based on 
the SHAM framework that can be used to model complex 
populations such as CMASS by accounting for stellar mass 
(and eventually color) completeness as a function of red¬ 
shift. CMASS is referred to as a “constant stellar mass” sam¬ 
ple but L15 demonstrate that CMASS is only truly stellar 
mass limited in a narrow mass and redshift range. In or¬ 
der to fully utilize this sample to understand the galaxy- 
halo connection, it is critical to account for the CMASS 
mass completeness function. Our paper accounts for these 
effects and hence addresses an important limitation of the 
CMASS sample which has typically been neglected in previous 
work. Our mock catalogs account for CMASS selection 
effects and reproduce the overall SMF, the two-point 
correlation function of CMASS, and the CMASS dn/dz. The 
mock catalogs and the HOD table as a function of redshift are 
made publicly available at www.massivegalaxies.com. After 
submitting our paper, a related effort by Rodríguez-Torres et al. 
(2015) was brought to our attention. Key differences between 
Rodríguez-Torres et al. (2015) and our work include the 
choice of the input stellar mass function, as well as the 
methodology for introducing scatter between stellar and halo 
mass. 

We use data from Stripe 82 to measure the total SMF 
at $\log_{10} M_* > 11.5$ and perform a joint fit to both 
the SMF and the projected two-point correlation function 
of CMASS galaxies. Our SHAM model (our “AbM model”) 
provides an excellent description of these two observables. 
Previous work has assumed that the CMASS HOD does not 


evolve with redshift. We re-investigate this assumption and 
show that the CMASS HOD should in fact vary strongly 
with redshift. Our model predicts that both the mean halo 
mass and the CMASS satellite fraction should vary with 
redshift. This variation is driven by the fact that the mean 
stellar mass of the sample increases at higher redshifts. In 
conclusion, our work suggests that CMASS is a complex 
sample for which the HODs are likely to vary with redshift 
in a non-trivial manner. 

The color selection applied to the CMASS sample may 
cause the two-point correlation function to be sensitive to 
assembly bias effects. We study the impact of such effects 
on the two-point correlation function using the age-matching 
framework recently introduced by H13. In contrast with 
H13, our sample lies firmly above collapse mass at z ~ 0.55, 
which corresponds to a relatively unexplored mass range. 
We demonstrate that in this regime, the effects of assembly 
bias are markedly different compared to the ones explored by 
H13 at lower stellar masses. For example, unlike in H13, in this 
regime $z_{\rm starve}$ is dominated by $z_{\rm char}$ and not by $z_{\rm form}$. Also, 
the $z_{\rm form}$ component of $z_{\rm starve}$ causes red galaxies to cluster 
less strongly than blue ones. However, we also find that the 
rank ordering according to $z_{\rm starve}$ produces the opposite effect 
and causes an increase in the clustering amplitude. We 
show that an excellent fit to the CMASS two-point correlation 
function (which includes assembly bias effects) can be 
achieved by balancing these two opposing effects. 

Overall, our two distinct models (standard abundance 
matching and age matching) can reproduce the galaxy SMF 
as well as $w_p$, suggesting at first sight a fundamental degeneracy 
between these models. However, we show that both models 
fail to reproduce the pseudo multipoles even though both 
provide a satisfactory description of $w_p$. Hence, our 
paper provides a clear cautionary example of the limitation 
of inferring the galaxy-halo connection from the projected 
correlation function alone. 

We investigate the redshift dependent clustering of 
CMASS and find that the observed stellar mass variation of 
the CMASS sample should lead to more than a factor of 2.0 
increase in the clustering amplitude over the CMASS red¬ 
shift range which is in stark contrast with the data. We ar¬ 
gue that this discrepancy suggests that, in addition to stellar 
mass, galaxy color must also play an important role in deter¬ 
mining the clustering amplitude of CMASS galaxies and that 
the observed constant clustering of CMASS may be due to 
a coincidental compensation between color and stellar mass 
with redshift. Given the discrepancy in the shape of the multipole 
correlation function, it may be necessary to consider velocity 
bias as recently studied by Reid et al. (2014) and Guo et al. 
(2014) in the HOD framework. However, the velocity bias 
between subhalos and galaxies is not yet well investigated 
for the mass scale and redshift range of interest here, and we 
defer this aspect to future work (but see Guo et al. (2016) 
for such an effort against the SDSS main sample). 

Finally, we discuss the physical processes that drive 
galaxy color in high-mass halos. We ask whether color in 
these massive galaxies is a stochastic 
process that is simply linked to small episodic amounts 
of gas cooling and/or merging events, or whether it is 
linked to halo properties such as halo age and hence 
perhaps more fundamentally tied to the large-scale reservoir 
of fuel and the assembly history. The stochastic scenario 
Figure 10. Redshift-dependent CMASS HODs from our AbM (red circles and triangles) and AgM (cyan circles and triangles) mock 
catalogs. The thin black lines in the middle panel correspond to the fiducial R14 CMASS HOD. Note that the virial halo mass in R14 is 
converted to the Rockstar one. The solid line represents centrals and the dashed line represents satellites. Our models should be compared 
with the thick black lines, which correspond to the R14 CMASS HOD after down-sampling to match the CMASS dn/dz. Numbers in 
parentheses represent the percentage of CMASS galaxies in each redshift bin compared to the full sample. The HOD table as 
a function of redshift will be made publicly available at www.massivegalaxies.com. 







Figure 11. Redshift evolution of the satellite fraction predicted 
from our AbM (red squares) and AgM (blue squares) models. 
The redshift-independent satellite fraction from R14 is shown as 
a horizontal black line. The grey shaded region indicates the $1\sigma$ 
error on the R14 satellite fraction. The satellite fraction in our 
SHAM models evolves with redshift and is only consistent with 
R14 at z ~ 0.6. 


corresponds to our AbM model in which galaxy color is randomly 
assigned at fixed stellar mass. Based on the comparison 
of the redshift-dependent clustering of CMASS with our 
AbM model, we argue that the stochastic color model can 
be ruled out with high significance by our analysis. In this 
case, color in high-mass halos may be linked to other properties 
besides halo peak velocity, suggesting that assembly 
bias effects may play a role in determining the clustering 
properties of this sample. 

Our current implementation of age matching also fails 
to reproduce the pseudo-multipoles. However, unlike in the 
case of the AbM model, in which the redshift dependence of the 
color cuts is unimportant, we know that our AgM model 
will be sensitive to these effects, which we have treated in 
a simplistic fashion. Hence, in a forthcoming paper, we will 
characterize the CMASS color distributions in greater detail 
and investigate whether a more realistic age-matching model can 
describe the CMASS pseudo-multipoles. This approach will 
provide powerful constraints on the physical mechanisms 
that drive galaxy color in massive halos. 
Figure 12. Left panel: comparison between the measured CMASS pseudo multipoles from R14 and the prediction from our AbM and 
AgM mock catalogs. Solid lines correspond to the pseudo monopole and dashed lines correspond to the pseudo quadrupole. Neither the 
AbM nor the AgM model is able to reproduce the BOSS measurements. Note that our errors on the pseudo multipoles look smaller 
than those in R14 because only measurement errors are included here. Middle panel: the redshift evolution of the pseudo multipoles in the AbM 
model prediction is shown as a fractional difference with respect to the measurement for the full sample. Red, blue, and green squares 
correspond to BOSS measurements in three different redshift bins. The measured BOSS pseudo multipoles display almost no variation 
with redshift. In stark contrast with the BOSS measurements, our models (solid colored lines) predict a significant evolution in the 
pseudo multipoles, driven by the fact that the mean stellar mass of CMASS increases by a factor of 1.8 over the range 0.43 < z < 0.7. 


State/Notre Dame/JINA Participation Group, Johns Hop¬ 
kins University, Lawrence Berkeley National Laboratory, 
Max Planck Institute for Astrophysics, Max Planck Insti¬ 
tute for Extraterrestrial Physics, New Mexico State Univer¬ 
sity, New York University, Ohio State University, Pennsyl¬ 
vania State University, University of Portsmouth, Princeton 
University, the Spanish Participation Group, University of 
Tokyo, University of Utah, Vanderbilt University, University 
of Virginia, University of Washington, and Yale University. 


APPENDIX A: TESTS OF THE SUBHALO 
CATALOG 

In this appendix, we discuss potential issues in the subhalo 
catalog, focusing in particular on the time evolution of sub¬ 
halo clustering and completeness issues due to the resolution 
of the simulation. 

We begin by testing if a single redshift output is sufficient 
to model CMASS over the redshift range 0.43 < z < 0.7. 
We rank order subhalos by $V_{\rm peak}$ and select the 
top N subhalos with a number density of 
$n \sim 1.58 \times 10^{-4}\,(h^{-1}\,{\rm Mpc})^{-3}$. This value roughly corresponds to the 
number density of galaxies with $\log_{10}(M_*/M_\odot) > 11.0$. Figure A1 
shows the three-dimensional correlation function of 
subhalos in real space as a function of separation at three 
different redshift outputs and at fixed number density n. 
The correlation function varies by at most 5% compared to 
z = 0.534 over the CMASS redshift range. The fractional 
difference at large scales, $r > 3\,h^{-1}\,{\rm Mpc}$, is 1-2%. The 
largest differences (at the level of 5%) are seen at the transition 
regime from the 2-halo to 1-halo term, $r < 1\,h^{-1}\,{\rm Mpc}$, 
where the errors on our observational clustering signal are 
increased by uncertainties due to the fiber-collision correction. 
In future work, especially when the S/N of the measurements 
increases (currently we are using DR10 measurements), 
these effects will need to be taken into account. 
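
The selection used in this test can be written compactly. The following sketch assumes a hypothetical toy catalog and box volume (neither corresponds to MDR1) and simply keeps the N = n × V subhalos with the largest V_peak, where n is the number density quoted above.

```python
def select_by_number_density(v_peak, number_density, box_volume):
    """Return indices of the top-N subhalos ranked by V_peak.

    N = number_density * box_volume, rounded to the nearest integer.
    """
    n_select = int(round(number_density * box_volume))
    if n_select > len(v_peak):
        raise ValueError("catalog is too small for the requested density")
    order = sorted(range(len(v_peak)), key=lambda i: v_peak[i], reverse=True)
    return order[:n_select]


# Hypothetical toy catalog: V_peak values in km/s inside a small box.
v_peak = [150.0, 420.0, 310.0, 275.0, 505.0, 198.0, 360.0, 230.0]
box_volume = 2.0e4  # (h^-1 Mpc)^3, illustrative only
selected = select_by_number_density(v_peak, 1.58e-4, box_volume)
print(selected)  # indices of the three largest V_peak values: [4, 1, 6]
```

Because the cut is defined by rank rather than by a fixed V_peak threshold, the same number density can be imposed on every redshift output, which is what allows the outputs to be compared directly.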

We perform two tests concerning the impact of the 
resolution of MDR1 on our results. Based on White et al. 
(2011) and also R14, we estimate that abundance matching 
for CMASS will require subhalos with $V_{\rm peak} \gtrsim 200\,{\rm km\,s^{-1}}$. 
Figure A2 presents the histogram of subhalos as a function 
of $V_{\rm peak}$. This histogram starts to deviate from a power 
law at $V_{\rm peak} \sim 200\,{\rm km\,s^{-1}}$ and has a clear turnover at 
$V_{\rm peak} \sim 150\,{\rm km\,s^{-1}}$. Figure A2 demonstrates that MDR1 
has a sufficient resolution for CMASS, although a higher 
resolution would be preferable. 

However, Figure A2 does not guarantee that the resolution 
is sufficiently high to trust our clustering predictions 
down to arbitrarily small scales. Our clustering signal 
is dominated by central-satellite pairs in the 1-halo term 
regime, implying that it is important to study the completeness 
of subhalos as a function of distance to their host halos, 
$R_{\rm sub}$. Because the true radial profiles of subhalos remain 
poorly known, it is difficult to precisely characterize the 
radius at which incompleteness effects become important. 
With this caveat in mind, Behroozi et al. (2013b) define the 
radius at which subhalo detections are incomplete as the radius 
where the logarithmic slope of the profile becomes larger 
than -1.5 (or -1.7). This cut-off is motivated by the density 
profiles of observed subhalos in the maxBCG cluster catalog 
(Tinker et al. 2011). Figure A3 displays the radial profiles 
of subhalos for different ratios of $V_{\rm peak}$, $\mu_{\rm sub} = V_{\rm peak}^{\rm sub}/V_{\rm peak}^{\rm host}$, 
and for three different bins in host halo mass (but divided 
by $V_{\rm peak}$). In general, this radial profile becomes gradually 
shallower at smaller $R_{\rm sub}$ because the density contrast 
between the parent halo and subhalos decreases in 
the inner regions of halos and subhalos become more difficult 
to detect. Using the Behroozi et al. (2013b) criterion, 
we estimate that subhalo detections become incomplete at 
0.1-0.7 $h^{-1}$ Mpc, depending on $\mu_{\rm sub}$ and $M_{\rm host}$, as shown 
Figure A1. Time evolution of the clustering of subhalos at fixed 
number density. Subhalos are inversely sorted by $V_{\rm peak}$ and a cut 
is imposed at a number density of $n \sim 1.58 \times 10^{-4}\,(h/{\rm Mpc})^3$. 
The upper panel shows a comparison of the three-dimensional 
correlation function of subhalos in real space at different redshift 
outputs: z = 0.436 (red), z = 0.534 (default, black), and z = 
0.609 (blue). The lower panel presents the fractional difference of 
the correlation function with respect to the one at the default 
z = 0.534 output. 



Figure A2. The histogram of host halos (blue square), subhalos 
(green triangle), and all halos (red circle) as a function of Vpeak- A 
clear turnover around Vpeak ~ 150kms~^ suggests that subhalos 
with Vpeak ^ 150kms“^ are not affected by resolution. 

in Figure A4. The smallest scale in our Wp measurement 
is 0.2/i“^Mpc and is indeed close to the incomplete¬ 
ness limit. We can definitely improve this situation by using 
higher resolution simulations. However, we expect that the 
impact of the resolution on our results should be relatively 
small, since the errors of our measured Wp on these scales 
are boosted by systematic uncertainties in the fiber colli¬ 
sion correction. We conclude that the resolution of MDRl is 
sufficient for our purpose, but higher resolution simulations 
would be preferable and will be adopted in subsequent work. 
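
The slope criterion used in Appendix A can be sketched as follows. This is a minimal illustration, assuming a binned radial profile of subhalo number density sorted by increasing radius. The function name `incompleteness_radius` and the rule of taking the outermost radius at which the logarithmic slope exceeds the threshold are assumptions of this sketch, not a description of the exact procedure of Behroozi et al. (2013b) or of this paper.

```python
import numpy as np

def incompleteness_radius(r, density, slope_threshold=-1.5):
    """Outermost radius at which d ln(density) / d ln(r) > slope_threshold.

    `r` must be sorted in increasing order and `density` must be positive.
    Returns None if the profile is steeper than the threshold everywhere.
    """
    r = np.asarray(r, dtype=float)
    density = np.asarray(density, dtype=float)
    # Finite-difference estimate of the logarithmic slope at each radius.
    slope = np.gradient(np.log(density), np.log(r))
    shallow = np.nonzero(slope > slope_threshold)[0]
    if shallow.size == 0:
        return None
    return float(r[shallow.max()])

# Toy broken power law (hypothetical): slope -1.0 inside, -2.5 outside.
log_r = np.linspace(-2.0, 0.0, 21)
radii = 10.0 ** log_r
r_break = radii[15]
profile = np.where(radii <= r_break,
                   (radii / r_break) ** -1.0,
                   (radii / r_break) ** -2.5)
r_incomplete = incompleteness_radius(radii, profile)
```

Passing `slope_threshold=-1.7` gives the alternative definition mentioned in the text. With noisy measured profiles one would likely smooth the profile or fit the slope over several bins before applying the threshold.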


Figure A3. Radial profiles of subhalos. Different colors 
correspond to different values of μsub = Vpeak^sub/Vpeak^host; samples in 
terms of μsub are created to contain an equal number of subhalos. 
The solid, dashed, and dotted lines correspond to three different 
bins in host halo mass (but divided in terms of Vpeak). As a 
reference, the logarithmic slope of -1.5 is also shown. The radius 
at which subhalo detections are incomplete is estimated as the 
radius where the logarithmic slope of the profile becomes larger 
than -1.5. The vertical black dashed line shows the minimum scale 
in our clustering measurement. 

Figure A4. Subhalo incompleteness radius as a function of μsub. 
Different colors indicate different bins in host halo mass. Circles 
with solid error bars show the results when the incompleteness radius 
is defined with respect to a logarithmic slope of -1.5. Squares with 
dashed error bars represent the results when the incompleteness radius 
is defined with respect to a logarithmic slope of -1.7. The 
horizontal black dashed line shows the minimum scale in our clustering 
measurement. Higher-resolution simulations would be preferable 
and will be adopted in a forthcoming paper. 


APPENDIX B: IMPACT OF THE SCATTER IN 
THE Vpeak-M* RELATION FOR THE 
AGE-MATCHING MODEL 

To study the impact of the assembly bias effect, we adopt 
the age-matching model where we reorganize the relation 
between subhalo age and galaxy color at fixed stellar mass 
rather than at fixed halo mass (or Vpeak). This is because 
we can perform rank-order matching only against observable 
quantities. However, because our study operates in the 
very steep end of the stellar mass function, we must verify 
that our results do not depend on the stellar mass bin 
width when performing age matching. In H13, the authors 
report that their analysis is insensitive to a stellar-mass 
bin width of Δlog M* = 0.05-0.2. In this appendix, we 
perform a similar exercise to H13 for our extreme 
age-matching model (see section 6.3). In the following, we 
demonstrate that our results are insensitive to our fiducial 
bin width of Δlog M* = 0.05. However, we also show that the 
choice of a fiducial bin width needs to take into 
consideration the scatter (in our case σ = 0.105). 

Figure B1. Testing our stellar-mass bin width in performing the 
age-matching model. We perform the age-matching model for the 
extreme case as discussed in section 6.3 but in terms of Vpeak itself 
as a halo-age proxy. Note that the best-fitting values in the simple 
abundance matching, (φ1, log10 M0, σ) = (1.86 × 10^-3, 10.89, 0.105), are 
adopted here. The wp results with the different bin sizes are shown in blue 
for Δlog M* = 0.05 and in cyan for 0.005, respectively. These 
results can be compared with the abundance-matching one (red), where a 
clear discrepancy with the blue or cyan curve is confirmed. The 
age-matching model with zstarve is also shown just for a comparison 
with figure 7. 

We perform a test in which we consider the extreme 
age-matching model in section 6.3 but we reshuffle with 
respect to Vpeak rather than zstarve. In addition, we test 
how the results vary if we use a different bin width. 
Figure B1 demonstrates that our results are insensitive to this 
change in bin width (Δlog M* = 0.05 (blue) and Δlog M* = 
0.005 (cyan)). We have also checked that our mean halo 
masses and satellite fractions are unchanged when going 
from Δlog M* = 0.05 to Δlog M* = 0.005. 
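
The role of the stellar-mass bin width can be made concrete with a small sketch of conditional rank-order matching. This is only an illustration under stated assumptions: the function name `age_match`, its arguments, and the convention that a larger halo-age proxy maps to a larger (redder) color value are all hypothetical choices for this example and are not the implementation used in this paper or in H13.

```python
import numpy as np

def age_match(mock_logm, mock_age, obs_logm, obs_color, bin_width=0.05):
    """Assign colors to mock galaxies by rank-ordering within stellar-mass bins.

    In each bin of width `bin_width` in log M*, mock galaxies are ranked by the
    halo-age proxy and receive quantiles of the observed color distribution of
    that bin in the same rank order. Bins without observed galaxies stay NaN.
    """
    mock_logm = np.asarray(mock_logm, dtype=float)
    mock_age = np.asarray(mock_age, dtype=float)
    obs_logm = np.asarray(obs_logm, dtype=float)
    obs_color = np.asarray(obs_color, dtype=float)

    low = min(mock_logm.min(), obs_logm.min())
    mock_bin = np.floor((mock_logm - low) / bin_width).astype(int)
    obs_bin = np.floor((obs_logm - low) / bin_width).astype(int)

    colors = np.full(mock_logm.shape, np.nan)
    for b in np.unique(mock_bin):
        members = np.nonzero(mock_bin == b)[0]
        observed = obs_color[obs_bin == b]
        if observed.size == 0:
            continue
        order = np.argsort(mock_age[members])  # ascending age proxy
        quantiles = (np.arange(members.size) + 0.5) / members.size
        colors[members[order]] = np.quantile(observed, quantiles)
    return colors
```

Calling the function twice, once with `bin_width=0.05` and once with `bin_width=0.005`, and comparing the clustering of the resulting color-selected samples mirrors the kind of test described above.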

Nevertheless, figure B1 shows a clear difference between 
the simple abundance matching (‘AbM’, red) and the 
extreme age matching results (‘AgM-Vpeak’, blue or cyan). In 
fact, the mean halo mass and the satellite fraction for the 
AbM (AgM-Vpeak) models are log(Mvir [M⊙ h^-1]) = 13.442 
(13.551), and fsat = 11.08% (9.12%), respectively. We 
argue that this difference originates from the non-zero 
scatter in the Vpeak-M* relation in the abundance matching. 
In performing the extreme age-matching model with Vpeak, 
CMASS galaxies with larger Δcol at fixed stellar mass are 
likely to have larger Vpeak. 

Our argument is confirmed by figure B2, where we 
perform the same exercise but set σ = 0. In this case, we 
find that our clustering prediction becomes stable with bin 
widths smaller than Δlog M* = 0.01, and that the AbM result 
is similar to the AgM-Vpeak one (compare red with magenta 
lines). In figure B2, we also display the AgM model with 
a variety of halo-age indicators (see section 6.3). The halo 
masses of the blue, green and cyan curves are very 
similar (log(Mvir [M⊙ h^-1]) = 13.513). Hence, differences in the 
clustering for the blue, green and cyan curves are a 
consequence of assembly bias effects. 


Figure B2. Testing the impact of the scatter in the Vpeak-M* 
relation on the age-matching model. Here we fix the bin width to 
Δlog M* = 0.01 and do not introduce the scatter, i.e., σ = 0. In 
this case, the abundance matching (red) and the age matching 
with Vpeak (magenta) result in identical clustering. As a 
comparison, the age-matching results with zform (blue), zstarve (green), 
and zchar (cyan) are also plotted to show the pure assembly 
bias effect in the absence of the scatter. 